Are you passionate about transforming complex multimedia data into powerful machine learning datasets? We’re on the lookout for a skilled Data Engineer to operate at the fascinating intersection of data engineering and applied machine learning. In this role, you'll design and implement robust data-processing pipelines, turning raw, intricate data into clean, research-ready datasets. If you thrive on building scalable solutions and ensuring data integrity for cutting-edge ML workflows, we'd love to collaborate with you!
Responsibilities
- Design, develop, and maintain scalable data-processing pipelines for large volumes of multimedia (audio, video) and sensor data (e.g., IMU), ensuring reliability and reproducibility.
- Gather and interpret processing requirements from stakeholders, translating them into practical technical solutions and devising novel approaches where needed.
- Perform diverse data-processing operations, from mathematical transformations and filtering to feature extraction, synchronisation, and inference through ML models.
- Interface with internal tooling, such as dataset management systems and training frameworks, to prepare raw data for machine learning, including validation, transformation, and quality assurance.
- Collaborate with machine learning researchers to integrate research prototypes into production pipelines.
- Ensure compliance with data governance, security, and relevant standards.
Minimum Requirements
- Bachelor’s degree in a relevant technical field (e.g., Computer Science, Data Science) with a minimum of 3 years of industry experience in machine learning or data engineering; or equivalent combination of education and experience.
- Demonstrable programming experience in Python using common ML and data libraries, e.g., NumPy, SciPy, pandas.
- Proficiency in Linux and shell scripting.
- Working knowledge of audio, image, and video formats.
Preferred Experience
- Experience using PyTorch or other Python machine-learning frameworks.
- Experience with relational and graph / NoSQL databases.
- Experience using REST APIs for data interactions.
- Experience working in a research environment.
- Strong mathematical background.
Benefits
- 401(k).
- Dental insurance.
- Health insurance.
- Vision insurance.
- We are an equal-opportunity employer and value diversity, equality, inclusion, and respect for people.
- The salary will be determined based on several factors, including, but not limited to, location, relevant education, qualifications, experience, technical skills, and business needs.
Additional Responsibilities
- Participate in OP monthly team meetings and team-building efforts.
- Contribute to OP technical discussions, peer reviews, etc.
- Contribute content and collaborate via the OP-Wiki/Knowledge Base.
- Provide status reports to OP Account Management as requested.
About Us
At OP, we help you harness the power of technology for maximum impact. A technology consulting and solutions company, we offer advisory and managed services, innovative platforms, and staffing solutions across a wide range of fields including AI, cyber security, enterprise architecture, and beyond. For nearly two decades, we’ve been challenging the status quo of the consulting industry, serving up fresh, ingenious thinking through a radically lean structure. Together, they deliver unprecedented performance at an unparalleled pace for faster results that propel your business forward.