Machine Learning (ML) Data Engineer
Job Description
Client is seeking an ML/Data Engineer. This new team member will be responsible for improving our ML platform and internal data for trading and risk management applications. The role leverages quantitative expertise and software engineering skills to deliver robust ML DevOps infrastructure, ensure quality control of ML models and data, and streamline ML model development and deployment workflows.
Job Responsibilities:
Collaborate with stakeholders (desk quants, IT, research, trading desks, and risk management) to design, implement, and support a Python-based platform for ML model development and operations.
Source, store, and curate data for the ML models.
Build shared tooling for the full life cycle of ML model development and operations (DevOps), including data curation, feature engineering, backtesting, hyperparameter tuning, model experimentation and performance monitoring.
Collaborate with desk quants and the research team and contribute to the development, deployment, and integration of ML models, providing technical guidance and support on ML DevOps.
Build and support ML model performance monitoring tools, such as automated pipelines for continuous tuning and performance monitoring, GUI apps, and model performance reports; help collect and maintain historical model performance data and analytics.
Collaborate with the desk quant and IT teams to build and optimize data pipelines to source, curate, and cleanse high-quality time series data for ML model development.
Proactively troubleshoot and resolve issues related to data quality, models, or ML systems to ensure consistent production performance.
Collaborate with the Technology team to support cloud integration and deployment of ML models.
Partner with the Technology team to enforce software development best practices including version control, unit testing, CI/CD, and robust deployment strategies.
Requirements:
A successful candidate must have the following skills or credentials:
Degree in Computer Science, Data Science, Statistics, Financial Engineering, or a related quantitative field.
At least 10 years of experience.
A minimum of 3 years of industry experience in ML platform/data/DevOps.
At least 2 years of experience in a lead role, or in a senior role if an individual contributor.
Proficient in Python, with proven experience in developing production-level tools and applications.
Strong working knowledge of data processing libraries such as pandas and NumPy, and familiarity with machine learning frameworks (e.g., scikit-learn, PyTorch).
Demonstrated expertise in building and managing ETL pipelines and working with time series data.
Familiarity with software development best practices including version control (e.g., Git), unit testing, and CI/CD.
Qualities of Ideal Candidate:
Strong analytical and problem-solving skills with the ability to work effectively under pressure and tight deadlines.
Excellent communication skills, capable of explaining complex technical concepts to non-technical stakeholders.
A collaborative team player with a proactive mindset and a passion for leveraging data to drive business innovations.
Self-starter, able to own and carry out a project end to end.
Preferred Skills:
The following skills are preferred but not strictly required:
Familiarity with data orchestration or distributed computing frameworks (e.g., Apache Airflow), along with experience with cloud platforms (e.g., AWS, GCP).
Prior experience in building and testing systematic trading and/or relative value models.
Familiarity with Python-based GUI frameworks (e.g., NiceGUI, PyQt) is a plus for developing internal tools.