Data Engineer

Techgene Solutions • Full-time • New York, United States, US • 2d ago

Role: Data Engineer, Full-time role

Location: New York, NY or Fort Mill, SC

This is an on-site role initially, with the possibility of transitioning to a hybrid model later.

Duration: 1+ year

Experience: 9+ years, lead role

Currently, we are unable to offer sponsorship. Candidates with independent work authorization are encouraged to apply.

Key Responsibilities:

Collaborate with cross-functional teams, including Data Scientists, Analysts, and Engineers, to gather data requirements and build scalable data solutions.
Design, develop, and maintain complex ETL pipelines using AWS Glue and PySpark, ensuring efficient data processing across batch and streaming workloads.
Ensure data integrity, quality, and security across data pipelines, applying best practices for encryption, IAM, and compliance.
Monitor and troubleshoot pipeline issues, continuously optimizing for cost and performance across AWS services.
Stay current with advancements in AWS Glue, PySpark, and data infrastructure tools, and recommend improvements where applicable.

Qualifications:

Deep understanding of Spark architecture, distributed processing, and performance tuning techniques.
Strong grasp of data modeling, schema design, and data warehouse concepts.
Experience with the AWS data ecosystem, including S3, Lambda, and the Glue Data Catalog.
Proficiency in Python (PySpark) for data transformation and automation tasks.
Familiarity with CI/CD practices and infrastructure-as-code tools such as Terraform is a plus.
Excellent communication and problem-solving skills, with the ability to work independently and in a team environment.