About DeepScribe
DeepScribe is building the future of healthcare technology. Our vision goes beyond automating medical notes - we are building AI agents for providers, streamlining clinical workflows such as clinical trial matching and billing. By embedding AI deeply into healthcare operations, we empower clinicians to deliver exceptional care.
We’ve raised over $60 million in total funding from top-tier investors, including Index Ventures and prominent angels such as Alexandr Wang (CEO of Scale AI) and Dylan Field (CEO of Figma). Our solutions are trusted by some of the largest healthcare organizations in the country, including The US Oncology Network (the nation’s largest oncology network) and Ochsner Health (the largest healthcare system on the Gulf Coast).
About The Role
We’re looking for a Senior Data Engineer to build the data infrastructure that powers DeepScribe. You'll be responsible for ensuring our data is reliable, accessible, and secure - from deidentifying sensitive PHI for analytics to building pipelines that help our teams make data-driven decisions.
You'll work on critical data infrastructure including product analytics, business intelligence, LLM cost monitoring, and the pipelines that transform raw clinical data into insights while maintaining strict HIPAA compliance.
What You’ll Do
- Design, build, and maintain data pipelines that power workflows across our data ecosystem
- Design and implement PHI deidentification pipelines, ensuring strict HIPAA compliance while enabling analytics
- Develop data infrastructure for product analytics, enabling teams to understand user behavior, feature adoption, and product performance
- Build and optimize our data warehouse (Redshift), including schema design, data modeling, and query performance tuning
- Partner with product, engineering, and business teams to define data requirements and deliver self-service analytics capabilities
- Build data quality monitoring, validation frameworks, and alerting to ensure data reliability
- Design and implement data governance policies, access controls, and audit logging for PHI and sensitive data
- Optimize data pipeline performance, cost, and reliability at scale
About You
- You have 5+ years of experience building production data pipelines and infrastructure
- You have strong experience with Airflow (or similar orchestration tools such as Prefect or Dagster) for building complex ETL/ELT workflows, as well as experience with dbt for data transformation and modeling
- You can work across the full data lifecycle - from ingestion and transformation to storage and analytics
- You're proficient in Python and SQL, with a strong understanding of data modeling and database optimization
- You have experience designing data models and schemas that balance performance, flexibility, and maintainability
- You understand data warehousing concepts and have worked with systems like Redshift, Snowflake, or BigQuery
- You care about data quality and have built monitoring, testing, and validation into your pipelines
- You move really, really fast while keeping the quality bar high
- You care deeply about your teammates and contribute to a collaborative, team-oriented culture
Nice to have
- Experience with AWS data services (S3, Glue, Lambda, MWAA)
- Experience working with PHI and an understanding of HIPAA compliance requirements, including deidentification methods
- Experience working with LLM APIs and analyzing AI model performance metrics
This is a remote position, although Bay Area residents are encouraged to work from our SF office.
Perks And Benefits
- $110,000 to $210,000 annual salary
- Meaningful equity stake in the company
- Flexible PTO
- Work-from-home stipend
- Medical, dental, vision, 401(k), and other benefits
About The Team
At DeepScribe, we value trust, teamwork, and transparency, and we’re dedicated to promoting diversity and equity in the workforce through inclusive hiring practices. Candidates with backgrounds that are underrepresented in the technology industry are encouraged to apply.
In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required eligibility verification form upon hire.
How To Use AI During Our Hiring Process
- When applying: create the first draft of your resume yourself, but it’s OK to use AI to help you polish it
- While preparing: use AI to research DeepScribe, practice your answers, or prepare questions for us
- During take-home assignments: feel free to use AI to help you complete your work, but be prepared to explain and take responsibility for anything that you deliver
- During live interviews: no AI assistance of any kind unless we indicate otherwise. We want to see how you think, approach problems, and work through challenges in real time