Job Summary
As a Data Engineer, you will design, develop, and maintain robust data pipelines and infrastructure to support data-driven decision-making. You will work closely with data analysts, data scientists, and stakeholders to ensure data availability, quality, and performance. Proficiency in Python, SQL, and Snowflake is essential for building and optimizing our cloud-based data ecosystem.
Key Responsibilities
- Data Pipeline Development: Design, build, and maintain scalable ETL/ELT pipelines using Python and Snowflake to ingest, transform, and load data from various sources (a minimal pipeline sketch follows this list).
- Data Modeling: Create and optimize database schemas, ensuring efficient storage and retrieval of structured and semi-structured data in Snowflake (see the modeling sketch below).
- Query Optimization: Write and optimize complex SQL queries to support analytics, reporting, and data transformations (see the tuning sketch below).
- Automation: Develop Python scripts to automate data workflows, integrate with APIs, and streamline data processing tasks (also illustrated in the pipeline sketch).
- Data Quality: Implement data validation and monitoring processes to ensure the accuracy, consistency, and reliability of data (see the validation sketch below).
- Collaboration: Work with cross-functional teams to understand data requirements and deliver solutions that meet business needs.
- Performance Tuning: Optimize Snowflake virtual warehouses and queries for cost-efficiency and high performance (see the tuning sketch below).
- Documentation: Maintain clear documentation of data pipelines, schemas, and processes.
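
To make the pipeline and automation bullets concrete, here is a minimal sketch of the kind of ingest-and-load job the role involves: extract records from a REST API with requests, flatten them with pandas, and load them through the Snowflake connector's write_pandas. The endpoint, credentials, warehouse, database, schema, and table names are all hypothetical placeholders, and this is one common pattern, not a prescribed implementation.

```python
# Minimal ETL sketch: pull JSON records from a REST API and append them to a
# Snowflake table. Endpoint, credentials, and object names are hypothetical.
import os

import pandas as pd
import requests
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

API_URL = "https://api.example.com/v1/events"  # hypothetical source endpoint


def extract() -> pd.DataFrame:
    """Fetch raw records from the source API and flatten nested JSON."""
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return pd.json_normalize(resp.json())


def load(df: pd.DataFrame) -> None:
    """Append the DataFrame to an existing Snowflake table via write_pandas."""
    conn = snowflake.connector.connect(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        warehouse="ETL_WH",    # hypothetical warehouse
        database="ANALYTICS",  # hypothetical database
        schema="RAW",          # hypothetical schema
    )
    try:
        # write_pandas stages the frame and runs COPY INTO behind the scenes;
        # the target table is assumed to already exist with matching columns.
        write_pandas(conn, df, table_name="EVENTS")
    finally:
        conn.close()


if __name__ == "__main__":
    load(extract())
```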
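For the modeling bullet, Snowflake stores semi-structured payloads natively in VARIANT columns, and typed views expose them to analysts. A minimal sketch, assuming the hypothetical landing table, view, and field names below:

```python
# Sketch: land semi-structured events in a VARIANT column, then expose a typed
# view for analysts. All object and field names are hypothetical.
import os

import snowflake.connector

conn = snowflake.connector.connect(
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    database="ANALYTICS",  # hypothetical database
)
cur = conn.cursor()

# Landing table keeps the full JSON payload alongside load metadata.
cur.execute("""
    CREATE TABLE IF NOT EXISTS RAW.EVENTS_JSON (
        payload   VARIANT,
        loaded_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
    )
""")

# Dot-notation paths plus ::casts project typed columns out of the VARIANT.
cur.execute("""
    CREATE OR REPLACE VIEW MART.EVENTS AS
    SELECT
        payload:event_id::STRING            AS event_id,
        payload:actor.id::NUMBER            AS user_id,
        payload:occurred_at::TIMESTAMP_NTZ  AS occurred_at
    FROM RAW.EVENTS_JSON
""")
conn.close()
```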
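For the data-quality bullet, validation often reduces to assertion queries run after each load, failing the run when a metric is out of range. A minimal sketch against the hypothetical view from the modeling sketch:

```python
# Sketch: post-load data quality checks. Each query returns a count of bad
# rows; any nonzero count fails the run. Object names are hypothetical.
import os

import snowflake.connector

CHECKS = {
    # Primary keys must be unique.
    "duplicate_event_ids": """
        SELECT COUNT(*) FROM (
            SELECT event_id FROM MART.EVENTS
            GROUP BY event_id HAVING COUNT(*) > 1
        )
    """,
    # Required fields must be populated.
    "null_user_ids": "SELECT COUNT(*) FROM MART.EVENTS WHERE user_id IS NULL",
}


def run_checks() -> None:
    conn = snowflake.connector.connect(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        database="ANALYTICS",  # hypothetical database
        warehouse="ETL_WH",    # hypothetical warehouse
    )
    try:
        cur = conn.cursor()
        for name, sql in CHECKS.items():
            cur.execute(sql)
            bad_rows = cur.fetchone()[0]
            if bad_rows:
                raise ValueError(f"data quality check '{name}' failed: {bad_rows} bad rows")
    finally:
        conn.close()


if __name__ == "__main__":
    run_checks()
```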
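The query-optimization and performance-tuning bullets largely come down to right-sizing warehouses, letting them auto-suspend when idle, and clustering large tables on common filter columns so queries prune micro-partitions instead of scanning everything. A sketch of the kinds of statements involved; the warehouse and table names are hypothetical:

```python
# Sketch: cost and performance tuning statements issued through the Python
# connector. Warehouse and table names are hypothetical.
import os

import snowflake.connector

conn = snowflake.connector.connect(
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    database="ANALYTICS",  # hypothetical database
)
cur = conn.cursor()

# Keep the transform warehouse small and suspend it after 60s of idle time,
# so credits are only consumed while queries actually run.
cur.execute("""
    ALTER WAREHOUSE ETL_WH SET
        WAREHOUSE_SIZE = 'SMALL'
        AUTO_SUSPEND = 60
        AUTO_RESUME = TRUE
""")

# Cluster a large fact table on its most common filter column so queries can
# prune micro-partitions instead of scanning the whole table.
cur.execute("ALTER TABLE MART.FACT_ORDERS CLUSTER BY (order_date)")
conn.close()
```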
Required Skills and Qualifications
- Education: Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field (or equivalent experience).
- Experience:
  - 2+ years of experience in data engineering or a related role.
  - Proven expertise in building data pipelines using Python (e.g., Pandas, SQLAlchemy, PySpark).
  - Strong proficiency in SQL for querying, data modeling, and optimization.
  - Hands-on experience with Snowflake for data warehousing, including SnowSQL, data loading, and managing semi-structured data such as JSON and Parquet (see the loading sketch below).
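
As context for that last requirement, loading semi-structured files in Snowflake typically means staging them (e.g., with SnowSQL's PUT command) and running COPY INTO with a matching file format; Parquet follows the same pattern with TYPE = PARQUET. A minimal sketch for JSON, where the stage, file format, and table names are hypothetical:

```python
# Sketch: load staged JSON files into the VARIANT landing table via COPY INTO.
# Stage, file format, and table names are hypothetical.
import os

import snowflake.connector

conn = snowflake.connector.connect(
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    database="ANALYTICS",  # hypothetical database
    schema="RAW",          # hypothetical schema
)
cur = conn.cursor()

cur.execute("CREATE FILE FORMAT IF NOT EXISTS JSON_FMT TYPE = JSON STRIP_OUTER_ARRAY = TRUE")
cur.execute("CREATE STAGE IF NOT EXISTS EVENTS_STAGE")

# Files are uploaded to the stage beforehand, e.g. from SnowSQL:
#   PUT file:///data/events/*.json @EVENTS_STAGE
cur.execute("""
    COPY INTO EVENTS_JSON (payload)
    FROM (SELECT $1 FROM @EVENTS_STAGE)
    FILE_FORMAT = (FORMAT_NAME = 'JSON_FMT')
""")
conn.close()
```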