Data Scientist

Parkar • Full-time • Atlanta, GA, US

Job Title: Data Scientist

Location: Atlanta, GA (Remote)

Job Type: Long-Term Contract

About The Role

We are seeking a highly motivated and skilled Data Scientist with strong expertise in data science fundamentals, machine learning (ML), and large language models (LLMs). The ideal candidate will have hands-on experience with the Databricks and Azure ecosystems, including PySpark for data processing and LLM fine-tuning within Databricks. This role involves building and optimizing data science solutions that use cloud-based technologies to deliver business value.

Key Responsibilities

Design, develop, and deploy data science and ML solutions on Databricks (Azure environment).
Work on the end-to-end ML lifecycle, from data preparation and feature engineering to model training, evaluation, and deployment.
Apply LLM fine-tuning and optimization techniques within Databricks for domain-specific use cases.
Utilize PySpark for distributed data processing, cleaning, and transformation.
Collaborate with data engineers, cloud architects, and business stakeholders to ensure seamless integration of ML models into production workflows.
Conduct exploratory data analysis (EDA), statistical modeling, and hypothesis testing to extract insights from structured and unstructured data.
Stay current with advancements in AI/ML, LLMs, and Databricks capabilities, and apply them to deliver innovative solutions.
Document methodologies, experiments, and best practices for knowledge sharing.

Required Skills & Qualifications

Bachelor’s/Master’s degree in Computer Science, Data Science, Statistics, AI/ML, or a related field.
Proven experience as a Data Scientist with exposure to ML and NLP projects.
Strong hands-on experience with Databricks on Azure (MLflow, Delta Lake, Databricks ML).
Proficiency in PySpark for large-scale data processing.
Experience in training, fine-tuning, and deploying LLMs within the Databricks environment.
Strong programming skills in Python and familiarity with ML frameworks and libraries (TensorFlow, PyTorch, scikit-learn, Hugging Face).
Solid understanding of data science workflows: data wrangling, feature engineering, model development, and evaluation.
Working knowledge of Azure cloud services (Azure Data Lake, Azure Synapse, Azure ML).
Strong problem-solving, analytical thinking, and communication skills.

Good-to-Have Skills

Experience with MLOps practices and tools (CI/CD for ML, MLflow).
Knowledge of vector databases and LLM deployment pipelines.
Familiarity with prompt engineering and RAG (Retrieval-Augmented Generation) techniques.
Exposure to generative AI projects on cloud platforms.