Description
- As a Databricks Data Engineer, you will analyze complex data sets to extract meaningful insights and support data-driven decision-making.
- Collect, process, and analyze large datasets to identify trends, patterns, and insights.
- Develop and implement machine learning models and algorithms to solve business problems.
- Create data visualizations and dashboards to communicate findings to stakeholders.
- Collaborate with project teams to understand data requirements and deliver relevant analytical solutions.
- Ensure data accuracy and integrity by performing data validation and quality checks.
Primary Skill Required for the Role:
- Databricks
Level Required for Primary Skill:
- Advanced (6-9 years of experience)
Job Overview:
We are seeking a skilled Data Engineer with hands-on Databricks experience to design, build, and optimize large-scale data pipelines and analytics solutions. You will work with cross-functional teams to enable scalable data processing using the Databricks Lakehouse Platform on Azure.
Key Responsibilities:
- Design and implement ETL/ELT pipelines using Databricks, Delta Lake, and Apache Spark (a minimal sketch follows this list)
- Collaborate with data scientists, analysts, and stakeholders to deliver clean, reliable, and well-modeled data
- Build and manage data workflows with Databricks Jobs, Notebooks, and Workflows
- Optimize Spark jobs for performance, reliability, and cost-efficiency
- Maintain and monitor data pipelines, ensuring availability and data quality
- Implement CI/CD practices for Databricks notebooks and infrastructure-as-code (e.g., Terraform, Databricks CLI)
- Document data pipelines, datasets, and operational processes
- Ensure compliance with data governance, privacy, and security policies
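For context on the first responsibility above, a minimal sketch of the kind of batch ETL pipeline this role involves, assuming PySpark with Delta Lake on a Databricks cluster; the storage path, table name, and columns are hypothetical:

```python
# Minimal batch ETL sketch: raw landing-zone files -> curated Delta table.
# Assumes a Databricks cluster (Delta Lake available); all names are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # preconfigured on Databricks

# Extract: read raw JSON from a hypothetical ADLS landing zone.
raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/orders/")

# Transform: deduplicate, type the timestamp, drop invalid rows.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write a Delta table, partitioned for downstream query pruning.
(clean.write
      .format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("analytics.orders_clean"))
```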
Qualifications:
- Bachelor’s or Master’s in Computer Science, Data Engineering, or a related field
- 5+ years of experience in data engineering or a similar role
- Strong hands-on experience with Databricks and Apache Spark (Python, Scala, or SQL)
- Proficiency with Delta Lake, Unity Catalog, and data lake architectures (see the upsert sketch after this list)
- Experience with cloud platforms (Azure, AWS, or GCP), especially their data services (e.g., S3, ADLS, BigQuery)
- Familiarity with CI/CD pipelines, version control (Git), and job orchestration tools (Airflow, Databricks Workflows)
- Strong understanding of data warehousing concepts, performance tuning, and big data processing
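As a concrete instance of the Delta Lake proficiency listed above, a sketch of an incremental upsert (MERGE) using the Delta Lake Python API; the table, staging path, and join key are hypothetical:

```python
# Incremental upsert (MERGE) into a Delta table -- a routine pattern for
# keeping a curated table in sync with newly arrived records.
# Assumes Databricks/delta-spark; all names are illustrative.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# New or changed rows staged by an upstream job (hypothetical path).
updates = spark.read.format("delta").load("/mnt/staging/orders_updates")

target = DeltaTable.forName(spark, "analytics.orders_clean")

(target.alias("t")
       .merge(updates.alias("s"), "t.order_id = s.order_id")
       .whenMatchedUpdateAll()     # refresh rows that changed
       .whenNotMatchedInsertAll()  # append rows seen for the first time
       .execute())
```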
Preferred Skills:
- Experience with MLflow, Feature Store, or other machine learning tools in Databricks (see the tracking sketch below)
- Knowledge of data governance tools such as Unity Catalog or Microsoft Purview
- Experience integrating BI tools (Power BI, Tableau) with Databricks
- Databricks certification(s) (Data Engineer Associate/Professional, Machine Learning, etc.)
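For the MLflow item above, a minimal experiment-tracking sketch using the core MLflow API; the experiment path, parameter, and metric value are hypothetical placeholders:

```python
# Minimal MLflow tracking sketch: record a parameter and a metric for one run.
# On Databricks the tracking server is preconfigured; names are illustrative.
import mlflow

mlflow.set_experiment("/Shared/demand-forecast")  # hypothetical experiment path

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("model", "gradient_boosting")
    mlflow.log_metric("rmse", 12.3)  # placeholder value, not a real result
```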