Sr. AI Data Engineer

ECU Health • Full-time • Remote (Greenville, North Carolina, United States) • $96.82k - $159.77k / year • 2m ago

About ECU Health

ECU Health is a mission-driven, 1,708-bed academic health care system serving more than 1.4 million people in 29 eastern North Carolina counties. The not-for-profit system comprises 13,000 team members, nine hospitals and a physician group of more than 1,100 academic and community providers practicing in over 180 primary and specialty clinics across more than 130 locations.

The flagship ECU Health Medical Center, a Level I Trauma Center, and ECU Health Maynard Children's Hospital serve as the primary teaching hospitals for the Brody School of Medicine at East Carolina University. ECU Health and the Brody School of Medicine share a combined academic mission to improve the health and well-being of eastern North Carolina through patient care, education and research.

Position Summary

The AI Data Engineer will be responsible for designing, developing, and maintaining data pipelines and infrastructure to support AI and machine learning applications. This role involves collaborating with various departments to ensure data is readily available, clean, and formatted for optimal use by AI models. The AI Data Engineer will play a crucial role in bridging the gap between raw data and actionable insights, contributing to the company's digital transformation and innovation efforts.

Responsibilities

Design and implement ETL processes to extract, transform, and load data from diverse sources into central data storage systems such as data warehouses or data lakes, and build batch and real-time data pipelines using Azure Data Factory, Azure Synapse Pipelines, and Azure Stream Analytics.
Develop scalable data pipelines that can handle large volumes of data with high velocity and variety within an Epic EHR system.
Ensure data quality and integrity by implementing validation and cleansing procedures for data ingested from source systems (e.g., SQL Server, Blob Storage, Event Hubs, IoT Hub, APIs) into Azure Data Lake Gen2 or Synapse Analytics.
Select and manage appropriate data storage solutions, including SQL, NoSQL, and cloud-based data warehouses, using Delta Lake and Parquet formats to enable performant, versioned data storage for ML training.
Build reusable, modular pipeline components using Azure Data Factory's Data Flows and custom Azure Functions.
Apply a comprehensive understanding of compliance frameworks (HIPAA) and security controls, using RBAC, Private Link, Key Vault integration, and data masking to secure sensitive data in AI pipelines.
Configure and maintain data processing platforms like Azure Databricks and Microsoft Fabric.
Automate data infrastructure operations, including pipeline deployment, monitoring, and maintenance.
Collaborate with data scientists, machine learning engineers, and other stakeholders to support AI model development and deployment.
Monitor and optimize the performance of data pipelines and infrastructure to ensure efficient data processing and storage.
Stay up to date with the latest advancements in AI and data engineering technologies and methodologies.
Azure Services: Data Factory, Synapse, Data Lake Gen2, Stream Analytics, Event Hubs, Azure ML, Key Vault, Purview
Big Data & Processing: Azure Databricks, PySpark, Delta Lake
Languages: Python, SQL, Scala (optional)
CI/CD & IaC: Azure DevOps, GitHub Actions, Terraform, Bicep
Monitoring & Logging: Azure Monitor, Log Analytics, Application Insights
Governance & Cataloging: Microsoft Purview, Azure Policy

Minimum Requirements

Bachelor's degree or higher in computer science, data science, engineering, mathematics, or a related field with 3 years of experience (including 1 year of work experience in an environment where HIPAA compliance is demonstrated), or a high school diploma or higher with 5 years of equivalent practical work experience (including 2 years of work experience in an environment where HIPAA compliance is demonstrated).
Proven experience in infrastructure as code (e.g., Terraform, CloudFormation).
Proven experience in data pipeline development and ETL processes.
Expertise in data storage systems, including SQL, NoSQL, data lakes, and data warehouses.
Proficiency in tools like Apache Kafka, Apache Spark, Airflow, and similar platforms.
Cloud platform expertise, including Azure, AWS, and Google Cloud.
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills to work effectively with cross-functional teams.
Experience in a healthcare environment, including familiarity with healthcare data management and regulations.
Understanding of Health Insurance Portability and Accountability Act (HIPAA) compliance.
Ability to work collaboratively in a team environment.

Preferred certifications include, but are not limited to:

Microsoft DP-203 (Azure Data Engineer)
Microsoft AI-102 (Certified Azure AI Engineer)

General Statement

It is the goal of ECU Health and its entities to employ the most qualified individual who best matches the requirements for the vacant position.

Offers of employment are subject to successful completion of all pre-employment screenings, which may include an occupational health screening, a criminal record check, and verification of education, references, and licensure.

We value diversity and are proud to be an equal opportunity employer. Employment decisions are based on business needs, job requirements and applicants' qualifications without regard to race, color, religion, gender, national origin, disability status, protected veteran status, genetic information and testing, family and medical leave, sexual orientation, gender identity or expression or any other status protected by law. We prohibit retaliation against individuals who bring forth any complaint, orally or in writing, to the employer, or against any individuals who assist or participate in the investigation of any complaint.