hackajob has partnered with a forward-thinking tech-driven business that prioritizes innovation in its digital solutions and leverages extensive industry data to drive impactful results.
Role: NLP Data Engineer
Location: Cambridge, MA (hybrid)
Salary: up to 180K
Minimum Qualifications:
- Bachelor's degree in Data Engineering, Computer Science, Software Engineering, or a related field.
- 5+ years of hands-on data engineering experience in production environments.
- Strong experience working with NLP, Generative AI, and unstructured data, including vector stores and semantic search.
- Experience building end-to-end ML/AI systems and overcoming high-volume, high-compute challenges.
- Proficient with big data tools (e.g., Spark, Kafka, Storm) and cloud platforms like AWS, GCP, or Azure.
- Skilled in automated testing, DevOps practices, and CI/CD pipelines (e.g., Jenkins, GitLab, CircleCI, Azure DevOps).
- Proficient in at least one major programming language (Python, Scala, Java, etc.).
- Familiar with machine learning libraries and NLP frameworks such as PyTorch, TensorFlow, and spaCy.
- Experience with infrastructure as code tools such as Terraform.
Preferred Qualifications:
- Master's or PhD in Data Engineering, Computer Science, Software Engineering, or related discipline
- Good understanding of ontologies and semantic harmonization of data across sources
- Experience implementing Generative AI solutions is a huge plus
- Proven track record of working with knowledge graphs and graph databases, and a good general understanding of database concepts
- Proficiency in semantic web technologies (SPARQL, RDF, OWL) and harmonization of data
- Experience working with complex biomedical datasets, including genomics, proteomics, and high-throughput screening
hackajob is a recruitment platform that matches you with relevant roles based on your preferences. To be matched with roles, you need to create an account with us.
This role requires you to be based in the US.