Job Summary
Synechron is seeking an experienced PySpark Data Engineer / Data Scientist to lead data pipeline development and advanced analytics initiatives within our financial data and index analytics division. This role plays a crucial part in building scalable data processing solutions, enabling data-driven insights, and supporting machine learning workflows in both batch and streaming environments. The ideal candidate will possess a strong technical foundation in big data processing, analytics, and software engineering, along with leadership capabilities to drive impactful data projects.
Software Requirements
Required Skills:
Proven expertise in Python programming, emphasizing clean, maintainable, and scalable code
Hands-on experience with PySpark in both batch and streaming workflows
Deep knowledge of data manipulation and feature engineering, including Pandas, NumPy, and visualization libraries (matplotlib, seaborn)
Experience with Spark components like Spark SQL, DataFrames, and Spark MLlib
Familiarity with data storage solutions: SQL and NoSQL databases (e.g., Hive, Cassandra)
Knowledge of workflow orchestration and CI/CD tools such as Apache Airflow, Jenkins, or GitHub Actions for scheduling and automation
Experience working with cloud environments, especially Azure or AWS for big data processing
Preferred Skills:
Hands-on with containerization and orchestration (Docker, Kubernetes)
Exposure to distributed storage solutions like Hadoop HDFS or Azure Data Lake
Overall Responsibilities
Design, develop, and optimize large-scale data pipelines using PySpark for structured, semi-structured, and unstructured data (5 years of experience required)
Lead the building of ML pipelines for training, validation, and deployment of models in streaming and batch modes (5 years of experience required)
Write high-quality, efficient code that supports data transformation, cleaning, and feature engineering
Collaborate with data scientists, analysts, and stakeholders to understand data requirements and deliver actionable insights
Build and maintain a reusable code base and automation scripts for data processing and model validation
Monitor pipeline performance, troubleshoot issues, and implement improvements to ensure robustness and scalability
Stay up-to-date with the latest in big data processing, ML techniques, and analytics tools to improve system efficiency and analytics capabilities
Technical Skills (By Category)
Programming Languages:
Required: Python, PySpark
Preferred: Scala, Java
Databases & Data Management:
SQL (MySQL, SQL Server), NoSQL (Cassandra, MongoDB), Hive, Data Lakes
Cloud Technologies:
Azure Data Factory, Azure Synapse, AWS Glue, S3 (preferred)
Frameworks & Libraries:
Spark MLlib, Pandas, NumPy, seaborn, matplotlib, scikit-learn (preferred)
Development Tools & Methodologies:
Jupyter, PyCharm, VSCode, Git, CI/CD (Jenkins, GitHub Actions), Airflow
Security & Data Governance:
Data privacy principles, secure data ingestion and output, compliance
Experience Requirements
7-12 years of experience in data engineering, analytics, or data science roles, with significant hands-on experience in big data processing and ML pipelines
Proven track record of building scalable data pipelines and supporting ML workflows in enterprise environments
Experience working with structured, semi-structured, and unstructured data across financial domains
Previous leadership or mentorship experience in a technical team is preferred
Day-to-Day Activities
Develop and optimize data pipelines for financial and index data using PySpark and related tools
Build ML workflows, feature engineering, and model deployment pipelines in both streaming and batch environments
Collaborate with business analysts and data scientists to refine data requirements and deliver insights
Automate data ingestion, transformation, and validation processes
Monitor system performance, troubleshoot issues, and implement tuning activities
Review code and pipeline health with peer teams, and uphold best practices in software development and data security
Qualifications
Bachelor’s or Master’s degree in Computer Science, Data Science, Mathematics, or a related field
Relevant certifications in big data, cloud platforms, or analytics (preferred)
Strong portfolio showcasing data pipeline projects, analytics solutions, and ML workflows
Professional Competencies
Critical thinking and analytical problem-solving skills
Excellent communication skills for technical and non-technical audiences
Leadership qualities to guide project execution and mentor junior team members
Adaptability to new tools, frameworks, and evolving project requirements
Ability to handle multiple priorities under pressure with a focus on quality and deadlines
SYNECHRON’S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. As a global company, we strongly believe that a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and abilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.