Job Description
We are seeking a highly skilled Python Data Engineer to join our team and play a critical role in building scalable, production-ready AI/ML systems. In this role, you will collaborate closely with Data Scientists and cross-functional teams to design, implement, and maintain robust machine learning platforms that power data-driven solutions across the organization.
Responsibilities
- Partner with Data Scientists to develop high-quality, reliable, and scalable machine learning systems.
- Design and implement frameworks, tools, and processes that streamline and automate the machine learning lifecycle.
- Automate ML pipelines, including model training, registration, deployment, promotion, and inference.
- Translate research-based models into production-grade software with fault tolerance and scalability in mind.
- Apply software architecture principles and design patterns to build resilient, reusable, and modular components.
- Establish coding standards, enforce best practices for code quality, and ensure comprehensive documentation and test coverage.
- Design and implement cloud-native architectures on AWS, with a focus on security, scalability, and cost efficiency.
- Develop Infrastructure as Code (IaC) using tools such as AWS CloudFormation, Terraform, or the AWS CDK.
- Build and maintain CI/CD pipelines for model deployment, infrastructure provisioning, and automated testing.
- Create reusable Python libraries, project templates, and infrastructure templates to accelerate Data and AI initiatives.
Required Skills & Qualifications
- 5+ years of professional programming experience in Python, with a track record of building large-scale, distributed, mission-critical systems.
- Strong expertise in developing and maintaining data pipelines and production ML workflows.
- Hands-on experience with testing, packaging, and deploying machine learning models.
- Proficiency in software engineering practices: design patterns, unit testing, refactoring, CI/CD, and version control.
- Expertise in object-oriented design and functional programming principles.
- Experience designing and implementing distributed computing systems.
- Ability to build and scale API endpoints and microservices for ML and data applications.
- Knowledge of MLOps principles, with proven ability to operationalize ML solutions.
- Familiarity with AWS Data and AI services such as SageMaker, Lake Formation, Glue, and Athena.
Education/Certifications
- High school diploma or GED required.
Employer Info
Our client is an American natural gas and crude oil pipeline company with headquarters in Houston, Texas.
We are an equal opportunity employer and do not discriminate in hiring or employment on the basis of race, color, religion, national origin, citizenship, gender, marital status, sexual orientation, age, disability, veteran status, or any other characteristic protected by federal, state, or local law.
JOB-10045121