Job Purpose
Responsible for analyzing, transforming, and deriving insights from large-scale data using Databricks and modern data stack technologies.
Designs and executes data pipelines using SQL and PySpark on Databricks, leveraging Unity Catalog for data governance and management.
Collaborates with cross-functional teams to build data models, perform exploratory analysis, and deliver actionable insights.
Optimizes queries on SQL Warehouses, integrates external APIs where needed, and ensures data quality and reliability across workflows.
Duties and Responsibilities
• Develop, maintain, and optimize scalable data pipelines using Databricks (PySpark and SQL).
• Perform data extraction, transformation, and loading (ETL/ELT) for large and complex datasets.
• Analyze data to generate actionable insights supporting business decision-making.
• Design and manage data models and curated datasets within Unity Catalog.
• Ensure data quality, consistency, and reliability through validation and monitoring frameworks.
• Optimize query performance on SQL Warehouses for faster analytics and reporting.
• Collaborate with business stakeholders to understand requirements and translate them into data solutions.
• Integrate and process data from multiple sources, including APIs and external systems.
• Create dashboards, reports, and visualizations to communicate insights effectively.
• Document data processes, workflows, and metadata to support governance and reproducibility.
Key Decisions / Dimensions
• Selection of appropriate data models, pipeline design, and transformation logic to meet business requirements efficiently.
• Optimization strategies for query performance, compute usage, and cost on Databricks SQL Warehouses and Spark workloads.
• Choice of data sources, integration approaches, and data quality rules to ensure reliable and governed datasets.
Individual Contributor role.
Scope includes multiple data sources and domains across environments (dev/qa/prod).
Major Challenges
• Handling large-scale, distributed datasets in Databricks while maintaining performance and cost efficiency of pipelines and queries.
• Ensuring data quality and consistency across multiple sources, evolving schemas, and governed environments like Unity Catalog.
• Translating ambiguous business requirements into scalable data models and actionable insights within tight timelines.
Required Qualifications and Experience
Educational Qualifications:
• Graduate or Post-Graduate in Computer Science, Information Technology, or Data Science/Technologies.
Work Experience:
• 0.6–2 years of hands-on data engineering/analyst experience.
Technical Expertise / Skills Keywords:
• Data Platforms & Tools: Databricks, Delta Lake, Unity Catalog, SQL Warehouses
• Programming & Querying: Python, PySpark, SQL, Spark SQL
• Data Engineering & Processing: ETL/ELT Pipelines, Data Modeling, Data Warehousing, Data Transformation, Big Data Processing, Apache Spark, Distributed Computing
• Cloud & Integration: Azure Data Services, Azure Data Lake Storage (ADLS), Azure Key Vault, REST API Integration
• DevOps & Workflow: Git, Azure DevOps, CI/CD Git Workflow Orchestration
• Analytics & Visualization: Exploratory Data Analysis (EDA), Dashboarding & Reporting (Power BI/Tableau)