
Python, PySpark, ETL Developer

7 days ago 2026/06/30
Other Business Support Services

Job description

Roles and Responsibilities

Data Pipeline Development
* Develop and maintain scalable batch ETL pipelines using Python and PySpark for data ingestion, transformation, and loading.
* Implement reusable transformation logic, ensuring pipelines are modular, testable, and easy to maintain.
* Optimize Spark jobs for performance (partitioning, caching, joins, shuffles) and cost efficiency.
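As an illustration of the "reusable, modular, testable" transformation logic described above, here is a minimal sketch in plain Python. The step names (`drop_missing_ids`, `normalize_country`), the `compose` helper, and the sample records are all hypothetical and not taken from the employer's codebase; in a PySpark job the same idea would typically be expressed as functions that take and return a DataFrame.

```python
from functools import reduce
from typing import Callable, Dict, List

Record = Dict[str, object]
Transform = Callable[[List[Record]], List[Record]]


def drop_missing_ids(rows: List[Record]) -> List[Record]:
    """Keep only rows that carry a non-null 'id' (hypothetical rule)."""
    return [r for r in rows if r.get("id") is not None]


def normalize_country(rows: List[Record]) -> List[Record]:
    """Trim and upper-case the 'country' field (hypothetical rule)."""
    return [
        {**r, "country": str(r.get("country", "")).strip().upper()}
        for r in rows
    ]


def compose(*steps: Transform) -> Transform:
    """Chain small transformation steps into one pipeline function."""
    def pipeline(rows: List[Record]) -> List[Record]:
        return reduce(lambda acc, step: step(acc), steps, rows)
    return pipeline


# Each step can be unit-tested alone; the pipeline is just their composition.
clean = compose(drop_missing_ids, normalize_country)

raw = [{"id": 1, "country": " in "}, {"id": None, "country": "us"}]
result = clean(raw)
print(result)
```

Keeping every step a small function with the same input and output type is what makes the steps reusable across pipelines and easy to test in isolation.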
Data Quality & Reliability
* Apply data validation checks, handle schema evolution, and ensure accuracy and completeness of processed datasets.
* Troubleshoot pipeline failures, analyze logs, and implement robust error handling and retry mechanisms.
* Monitor job runs and support operational stability through alerts, runbooks, and timely incident resolution.
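The validation checks and retry mechanisms mentioned above could look roughly like the following sketch. It is plain Python under stated assumptions: the required fields, the `flaky_load` task, and the retry settings are hypothetical placeholders, not details from this posting.

```python
import time
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]

REQUIRED_FIELDS = ("id", "amount")  # hypothetical schema for illustration


def validate(rows: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Split rows into (valid, rejected) based on required non-null fields."""
    valid, rejected = [], []
    for row in rows:
        ok = all(row.get(field) is not None for field in REQUIRED_FIELDS)
        (valid if ok else rejected).append(row)
    return valid, rejected


def run_with_retry(task: Callable[[], List[Record]],
                   attempts: int = 3,
                   base_delay: float = 0.01) -> List[Record]:
    """Run task, retrying with exponential backoff; re-raise on final failure."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:  # a real job would catch narrower, known-transient errors
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("attempts must be at least 1")


calls = {"count": 0}


def flaky_load() -> List[Record]:
    """Hypothetical source that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]


valid, rejected = validate(run_with_retry(flaky_load))
print(len(valid), "valid;", len(rejected), "rejected")
```

Routing rejected rows to a separate output, rather than dropping them silently, keeps completeness checks auditable.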
Collaboration & Delivery
* Work with cross-functional teams to gather requirements, define data mappings, and deliver datasets aligned to business needs.
* Participate in code reviews, follow engineering best practices, and contribute to continuous improvement of standards and tooling.
* Document pipeline logic, dependencies, and operational procedures for smooth handovers and long-term maintainability.



Additional Responsibilities

* Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field (or equivalent practical experience).
* 2-5 years of hands-on experience building data pipelines using Python and PySpark.
* Strong understanding of ETL concepts, data transformations, and handling large-scale datasets.
* Proficiency in writing clean, maintainable code and debugging production issues.
* Working knowledge of data structures, algorithms, and software development best practices.



Technical Requirements

* Technology -> Analytics - Packages -> Python - Big Data
* Technology -> Big Data - Data Processing -> PySpark, ETL



Job Description

Build and scale data solutions that power smarter decisions. In this role, you'll work at the intersection of software engineering and data engineering, using Python, PySpark, and ETL to transform raw, complex datasets into reliable, analytics-ready assets. You'll collaborate closely with data engineers, analysts, and stakeholders to understand requirements, design efficient pipelines, and deliver high-quality outputs on time. If you enjoy solving performance challenges, improving data quality, and creating maintainable code that runs in production, this is a great opportunity to grow your impact. Expect a supportive, collaborative environment where ownership is encouraged, learning is continuous, and your contributions directly improve how teams access and trust data.


This job post has been translated by AI and may contain minor differences or errors.