Data Scientist - NLP/Topic Modeling Engineer
Location: Greenwood Village, CO (4 days onsite, 1 day remote)
Duration: 6-month contract with possible/likely extension
Compensation: $60/hr+ DOE, W-2
Screening: Candidates must be able to complete a background check and drug screen
Team / Project Description: This team handles Natural Language Processing (NLP) and an internal GenAI tool. The team's newest project manages Customer Escalation Tickets. Tickets come from various channels (call centers, NPS scores, etc.), and the team is building ML models to classify each ticket and route it to the proper team for resolution.
Top Skills:
- SQL
- Python, PySpark
- AWS familiarity
- AI/Machine Learning experience is a plus
- Ability to communicate with stakeholders; business acumen
- Understanding of CI/CD
Role Overview
- Lead the topic modeling process (e.g., clustering, taxonomy) to identify new resolution categories from network-to-network processing.
- Develop and refine classification models that map text inputs to specific resolution categories (initially 20+).
- Collaborate with Data/ML Engineers to build data pipelines and ensure model scalability for production.
Key Responsibilities
- Topic Modeling & Taxonomy Development
  - Perform text clustering and exploratory data analysis to uncover new resolution categories.
  - Validate, refine, and finalize taxonomy in collaboration with technical and business stakeholders.
- NLP Model Development
  - Design and implement classification models (NLP or GenAI-based) to categorize inputs into the identified resolution types.
  - Ensure the model handles both voice-transcribed text and technician tickets effectively.
- Model Deployment & Validation
  - Work with Data/ML Engineers to integrate models into production.
  - Develop and maintain performance metrics and dashboards.
- Cross-Functional Collaboration
  - Partner with the Data/ML Engineer to ensure smooth data ingestion, transformation, and model deployment.
  - Communicate insights and recommendations to stakeholders.
Technical Skills & Experience
- NLP/LLM Expertise: Experience in text analytics, topic modeling (LDA, clustering), classification techniques, and large language models.
- Programming: Strong Python (pandas, PySpark, scikit-learn) skills; Scala/Spark exposure is a plus.
- Cloud: Familiarity with AWS (S3, EC2, EMR, or similar) and some Azure exposure (LLM hosting).
- MLOps: Experience with CI/CD for ML, model monitoring, and data version control is a plus.
- Data Exploration & Visualization: Ability to create meaningful visualizations and draw insights from data sets.
Additional Qualifications
- 5+ years of relevant data science experience (or equivalent).
- Strong mathematical background.
- Demonstrable GitHub portfolio showcasing NLP or other data science projects.
- Strong communication skills to translate technical findings to stakeholders.
***Note: The manager is also open to a recent graduate with internship experience who has gathered their own data and used it to build a model.***