Data Engineer I (Remote)
Job Summary:
Permuta Technologies, Inc. is a dynamic and innovative technology company, deeply committed to delivering specialized software solutions, primarily for military and federal civilian agencies. At the heart of Permuta's mission lies a dedication to operational excellence and organizational readiness. Permuta software gives DoD and approved civilian organizations a low-code/no-code SaaS/AI solution that ingests existing data sources, regardless of location, and presents them in a single pane of glass, so leaders can make readiness decisions that help our forces be stronger and safer and our country more competitive. We specialize in creating modular, user-friendly applications that address unique challenges in workforce management, talent management, readiness, and training management. Our values are centered on customer-centricity, innovation, and the effective harnessing of cutting-edge technology to serve the specific needs of government agencies.
We are seeking a mission-driven Data Engineer to support the development of a next-generation AI/ML product built on Azure Synapse Analytics, Azure ML Studio, Azure Data Factory, and Azure AI Foundry. This role focuses on building scalable data pipelines, engineering prompts for language models, and transforming structured and unstructured data to support intelligent, secure, and responsible AI solutions for U.S. government customers. The Data Engineer will also contribute to data governance execution, testing strategies, and MLOps workflows under the guidance of senior team members.
Duties/Responsibilities:
- Develop and maintain Synapse Analytics and Data Factory pipelines for training and evaluation (T&E) of data assets.
- Engineer prompts and context windows for language models using Azure OpenAI, including embedding strategies and retrieval pipelines.
- Ingest data from legacy systems and external APIs into Synapse Analytics.
- Validate, transform, and harmonize data across multiple sources.
- Collaborate with data scientists to prepare datasets for model training and evaluation.
- Document data flows, schemas, and prompt engineering strategies.
- Contribute to secure, scalable, and compliant data architecture development.
- Perform other duties as assigned.
Education and Experience:
- Bachelor’s degree in computer science, relevant certifications, or equivalent work experience.
- 0-2 years of related experience.
- Proficiency in Python, PySpark, Spark SQL, and data transformation techniques.
- Familiarity with Azure Synapse Analytics, Azure ML Studio, Azure Data Factory, and Azure OpenAI.
- Basic understanding of prompt engineering and language model context management.
- Strong analytical and problem-solving skills.
- Ability to work independently and collaboratively in a fast-paced environment.
- Excellent written and verbal communication skills.
Security Requirements:
- Must be a U.S. Citizen and live in the United States.
- Must be able to obtain a U.S. Government Security Clearance (current clearance preferred).
Desired Knowledge:
- Experience with vector databases and retrieval-augmented generation (RAG).
- Exposure to data governance and compliance frameworks (e.g., DoD, NIST).
- Familiarity with Azure DevOps and CI/CD pipelines.
- Knowledge of data visualization tools (e.g., Power BI).
- Experience working in agile development environments.
www.permuta.com
The annual compensation range for this full-time position is $62,000 to $90,000. The final base pay offered to the successful candidate will be determined by factors including work location and individual qualifications, such as job-related skills, experience, and relevant education or training.
Job Type: Full-time
Pay: $62,000.00 - $90,000.00 per year
Benefits:
- 401(k)
- 401(k) matching
- Dental insurance
- Employee assistance program
- Flexible spending account
- Health insurance
- Health savings account
- Life insurance
- Paid time off
- Parental leave
- Vision insurance
Application Question(s):
- Are you a U.S. citizen?
- Do you have a valid security clearance?
- Do you have Retrieval-Augmented Generation (RAG) experience?
- Can you read and write Python?
Education:
Experience:
- Scikit-learn: 1 year (Required)
- Python: 1 year (Required)
Security clearance:
Work Location: Remote