Description
- As a Data Scientist Contractor, you will analyze complex data sets to extract meaningful insights and support data-driven decision-making.
- Collect, process, and analyze large datasets to identify trends, patterns, and insights.
- Develop and implement machine learning models and algorithms to solve business problems.
- Create data visualizations and dashboards to communicate findings to stakeholders.
- Collaborate with project teams to understand data requirements and deliver relevant analytical solutions.
- Ensure data accuracy and integrity by performing data validation and quality checks.
Job Overview:
We are seeking a skilled Data Engineer with hands-on Databricks experience to design, build, and optimize large-scale data pipelines and analytics solutions. You will work with cross-functional teams to enable scalable data processing using the Databricks Lakehouse Platform on Azure.
Key Responsibilities:
- Design and implement ETL/ELT pipelines using Databricks, Delta Lake, and Apache Spark (see the sketch after this list)
- Collaborate with data scientists, analysts, and stakeholders to deliver clean, reliable, and well-modeled data
- Build and manage data workflows with Databricks Jobs, Notebooks, and Workflows
- Optimize Spark jobs for performance, reliability, and cost-efficiency
- Maintain and monitor data pipelines, ensuring availability and data quality
- Implement CI/CD practices for Databricks notebooks and infrastructure-as-code (e.g., Terraform, Databricks CLI)
- Document data pipelines, datasets, and operational processes
- Ensure compliance with data governance, privacy, and security policies
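To ground the core pipeline responsibility above, here is a minimal PySpark sketch of the kind of Delta Lake ETL step described; the storage path, columns, and table name are hypothetical placeholders rather than details of this role.

```python
# Minimal ETL sketch: raw JSON files -> cleaned Delta table.
# The ADLS path, column names, and target table are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # pre-created in Databricks notebooks

# Extract: read raw files from a landing zone (hypothetical path)
raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/orders/")

# Transform: deduplicate, enforce types, and drop bad records
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: append to a partitioned Delta table for downstream consumers
(cleaned.write
        .format("delta")
        .mode("append")
        .partitionBy("order_date")
        .saveAsTable("bronze.orders"))  # hypothetical schema.table
```

In practice a notebook like this would be scheduled and monitored as a Databricks Job or Workflow task, per the responsibilities above.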
Primary Skill Required for the Role:
- Databricks
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
- 5+ years of experience in data engineering or a similar role
- Strong hands-on experience with Databricks and Apache Spark (Python, Scala, or SQL)
- Proficiency with Delta Lake, Unity Catalog, and data lake architectures (illustrated in the sketch after this list)
- Experience with cloud platforms (Azure, AWS, or GCP), especially data services (e.g., S3, ADLS, BigQuery)
- Familiarity with CI/CD pipelines, version control (Git), and job orchestration tools (Airflow, Databricks Workflows)
- Strong understanding of data warehousing concepts, performance tuning, and big data processing
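As an illustration of the Unity Catalog and performance-tuning qualifications above, the sketch below queries a table through the three-level namespace and runs a routine Delta compaction; all catalog, schema, and table names are hypothetical.

```python
# Sketch: Unity Catalog three-level namespace plus routine Delta maintenance.
# The catalog/schema/table names below are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Unity Catalog addresses tables as <catalog>.<schema>.<table>
orders = spark.table("main.sales.orders")
orders.where("order_date = current_date()").groupBy("region").count().show()

# Compact small files and co-locate data to speed up selective scans
spark.sql("OPTIMIZE main.sales.orders ZORDER BY (region)")
```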
Preferred Skills:
- Experience with MLflow, Feature Store, or other machine learning tools in Databricks (see the example after this list)
- Knowledge of data governance tools like Unity Catalog or Purview
- Experience integrating BI tools (Power BI, Tableau) with Databricks
- Databricks certification(s) (Data Engineer Associate/Professional, Machine Learning, etc.)
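For the preferred MLflow experience, a minimal tracking sketch might look like the following; the experiment path, model, and metrics are illustrative assumptions, not specifics of this posting.

```python
# Sketch of MLflow experiment tracking (MLflow ships with Databricks runtimes).
# The experiment path, parameters, and model are illustrative assumptions.
import mlflow
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=42)

mlflow.set_experiment("/Shared/demo-experiment")  # hypothetical workspace path
with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")
```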
Level Required for Primary Skill:
- Advanced (6-9 years of experience)
Job Type: Contract
Pay: Up to $49.00 per hour
Expected hours: No less than 40 per week
Application Question(s):
- Are you comfortable working on a W2 basis?
- Are you legally authorized to work in the United States (US citizen or Green Card holder)?
- Are you willing to work on a contract-to-hire basis?
- How many years of work experience do you have with Azure Databricks?
- How many years of work experience do you have in data engineering or a similar role?
- How many years of work experience do you have with Apache Spark?
- How many years of work experience do you have with Python, Scala, or SQL?
- Are you proficient in Delta Lake, Unity Catalog, and data lake architectures?
- How many years of experience do you have with cloud platforms (Azure, AWS, or GCP), especially data services (e.g., S3, ADLS, BigQuery)?
- How many years of experience do you have with data warehousing concepts, performance tuning, and big data processing?
Work Location: Hybrid remote in Pleasanton, CA 94566