Data Scientist – Onsite | Texas
Key Responsibilities
- Lead the design, development, and deployment of predictive and prescriptive models to drive business impact across multiple domains.
- Apply causal inference and advanced statistical techniques (e.g., propensity score matching, A/B testing, structural equation modeling, synthetic controls) to uncover cause–effect relationships and guide data-driven decisions.
- Build and operationalize NLP solutions for unstructured text data, including entity extraction, sentiment analysis, classification, and topic modeling.
- Develop, optimize, and maintain large-scale data pipelines and analytics workflows using Azure and Databricks.
- Partner with engineering, product, and business teams to translate complex business problems into scalable data science solutions.
- Deliver clear and actionable insights through visualizations, reports, and presentations tailored to both technical and executive audiences.
- Champion best practices in model development, deployment, versioning, and monitoring across the data science lifecycle.
Required Qualifications
- 5+ years of professional experience in Data Science or Advanced Analytics.
- Strong expertise in predictive modeling, prescriptive analytics, and statistical methods (regression, classification, clustering, optimization).
- Proven experience with causal analysis (experiments, quasi-experiments, or causal inference frameworks).
- Proficiency in Natural Language Processing (NLP) using modern libraries such as Hugging Face, Spark NLP, or spaCy.
- Advanced programming skills in Python (pandas, scikit-learn, statsmodels, PySpark) and SQL.
- Hands-on experience with Databricks for large-scale data engineering and machine learning workflows.
- Strong knowledge of Azure Cloud Services (Azure Machine Learning, Azure Data Lake, Microsoft Fabric, Azure SQL, Azure Functions).
- Understanding of MLOps principles (model versioning, CI/CD for ML, monitoring, reproducibility).
- Excellent communication and storytelling skills, with the ability to present insights effectively to diverse audiences.
Preferred Qualifications
- Advanced degree (MS or PhD) in Data Science, Computer Science, Statistics, Applied Mathematics, or a related field.
- Experience with deep learning frameworks (TensorFlow, PyTorch) for NLP or complex modeling tasks.
- Exposure to healthcare, life sciences, or other regulated industries that emphasize interpretability and causal analysis.
- Familiarity with reinforcement learning, prescriptive optimization, or advanced decision science methodologies.
- Contributions to open-source projects, academic publications, or recognized thought leadership in the data science community.