Senior Data Engineer (ML, Quality & Automation)
Join a fast-growing, product-led SaaS organization that’s shaping the future of enterprise data management. We’re on a mission to empower businesses with reliable, trusted data to fuel intelligent products and services at scale. With a collaborative, innovation-driven culture, we’re tackling complex data challenges and building cutting-edge solutions that make a global impact.
We are seeking a Senior Data Engineer (ML, Quality & Automation) to take ownership of designing and implementing automated frameworks that ensure the integrity and reliability of our data pipelines. This role is pivotal to maintaining the health of our AI and machine learning workflows, ensuring that the data powering our products is accurate, monitored, and production-ready.
You will work closely with data engineers, ML specialists, and product teams to build scalable quality checks, integrate them into modern CI/CD pipelines, and develop systems for real-time monitoring and anomaly detection. If you’re passionate about automating data validation and driving quality across complex datasets, this is a unique opportunity to shape the foundation of our data reliability strategy.
What You’ll Be Doing
- Designing and implementing automated data quality checks for training data, features, and real-time inputs across ML pipelines.
- Building and maintaining data validation frameworks using Python, SQL, and modern tools such as Great Expectations, dbt, or Apache Deequ.
- Integrating quality validation steps into CI/CD and orchestration workflows using tools like Airflow, MLflow, or Kubeflow.
- Developing anomaly detection and drift monitoring systems to ensure data integrity across pipelines.
- Creating dashboards, reports, and alerts to provide visibility and early warnings on data issues.
- Collaborating with stakeholders across engineering, data science, and product to define KPIs and enforce data quality governance.
What We’re Looking For
We’re looking for someone who is as comfortable writing Python scripts as they are collaborating with cross-functional teams to drive data strategy. You should be able to take ownership of data quality across the organization and continuously innovate on how we monitor and maintain our systems.
Core skills and experience:
- 7+ years of experience in data engineering, MLOps, or data quality automation.
- Proficiency in Python and SQL.
- Experience with data validation frameworks such as Great Expectations, Apache Deequ, or dbt.
- Hands-on experience with CI/CD workflows and with orchestration or ML lifecycle tools (Airflow, MLflow, Kubeflow).
- Knowledge of machine learning workflows, including training, serving, and feature pipelines.
- Familiarity with anomaly detection and data drift monitoring.
- Exposure to cloud-based data platforms and storage such as Snowflake, BigQuery, or Amazon S3.
Nice to have:
- Experience with data observability platforms (e.g., Monte Carlo, Soda, Datafold).
- Knowledge of data governance practices and compliance workflows.
- Familiarity with alerting and monitoring tools (Prometheus, Grafana, or similar).
Why Join?
Be at the forefront of data reliability engineering in an AI-driven environment. You will collaborate with a talented, forward-thinking team at a fast-paced SaaS company while working on data and ML infrastructure at scale.