Are you passionate about building scalable data solutions that power real-time insights and machine learning? Join our team as a Software Data Engineer, where you’ll design and maintain cutting-edge data architectures on Azure using Databricks.
What You’ll Do:
- Design and optimize data pipelines and ETL/ELT workflows using Databricks on Azure (see the sketch after this list for a flavor of this work).
- Build and manage modern data platforms, including data lakes, data warehouses, and lakehouses, for both streaming and batch processing.
- Integrate large-scale datasets from diverse sources using Delta Lake and other storage formats.
- Develop feature stores and data preparation workflows for ML applications.
- Ensure data quality through validation frameworks and governance standards.
- Monitor and optimize workflows for performance, cost-efficiency, and reliability.
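To give candidates a concrete flavor of the pipeline work described above, here is a minimal PySpark sketch of a batch ETL job that writes a Delta table to ADLS. It is illustrative only, not our production code: the storage paths, schema, and column names (order_id, amount, order_ts) are hypothetical, and on Databricks the Delta session configs shown below are already preset.

```python
# Minimal, illustrative PySpark + Delta Lake batch job.
# Paths, schema, and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("orders-etl")
    # Delta Lake session configs; on Databricks these are preconfigured.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Extract: read raw JSON files landed in ADLS (hypothetical container/path).
raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/orders/")

# Transform: deduplicate, enforce types, and derive a partition column.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write a date-partitioned Delta table for downstream analytics and ML.
(clean.write.format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .save("abfss://curated@example.dfs.core.windows.net/orders_delta/"))
```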
What We’re Looking For:
- 3+ years of hands-on experience with Databricks on Azure, Apache Spark (PySpark/Spark SQL), and Delta Lake.
- Strong knowledge of Azure Data Factory, Azure Data Lake Storage (ADLS), Synapse Analytics, and Azure Machine Learning.
- Proficiency in Python, SQL, and Unix/Linux scripting; Java or Scala is a plus.
- Experience with streaming technologies like Apache Kafka, Azure Event Hubs, and Azure Stream Analytics.
- Familiarity with CI/CD tools (Azure DevOps, GitHub Actions) and workflow orchestration tools (Apache Airflow, ADF, Databricks Workflows).
Bonus Skills:
- Experience with ML frameworks (scikit-learn, TensorFlow, PyTorch) and feature engineering on Azure.
- Familiarity with LLM applications, prompt engineering, and AI agent tooling such as Azure OpenAI and Semantic Kernel.
- Containerization experience with Docker and Azure Kubernetes Service (AKS).
- Experience with monitoring tools such as Azure Monitor, Datadog, and Grafana.
Education:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.