About the role
As a Senior Data Engineer at Clario, you will play a critical role in designing and building the modern data infrastructure that powers advanced analytics, machine learning, and AI‑driven innovation across our clinical technology platform. You will architect cloud‑native, scalable, and secure data systems that support regulated clinical environments, ensuring data flows are reliable, compliant, and optimized for next‑generation clinical insights. Partnering closely with data scientists, AI engineers, software engineers, and product teams, you will help evolve a data ecosystem capable of supporting large‑scale clinical datasets, imaging studies, and AI‑enabled applications.
What You’ll Be Doing
- Design, build, and maintain scalable ETL/ELT pipelines for structured and unstructured clinical data
- Develop and optimize data models supporting analytics, reporting, and machine learning workflows
- Build and maintain cloud‑native data architectures within AWS environments
- Develop pipelines that support AI and machine learning model development and deployment
- Operationalize and productionize machine learning models developed by Data Science teams
- Ensure data quality, integrity, governance, and regulatory compliance
- Improve performance, reliability, and scalability of large‑scale data platforms
- Collaborate closely with data scientists, AI engineers, software engineers, and product teams
- Translate clinical and business requirements into scalable data engineering solutions
- Implement monitoring, observability, and automated validation across data pipelines
- Contribute to data engineering standards, architecture design, and platform evolution
What We Look For
- Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related quantitative field
- 5+ years of experience in data engineering or data platform development
- Strong proficiency in Python and SQL
- Experience designing and maintaining scalable data pipelines in cloud environments
- Hands-on experience with AWS services such as S3, Redshift, Glue, Lambda, EMR, or similar
- Strong understanding of data modeling, schema design, and performance optimization
- Experience supporting machine learning or AI workflows in production environments
- Experience working with distributed or large-scale data architectures
- Strong analytical, problem-solving, and communication skills
- Experience in regulated industries (healthcare, life sciences, clinical research) is a plus
Preferred Experience
- Experience with AI/ML data pipelines or generative AI workflows
- Experience handling large-scale or high-volume datasets
- Experience working with medical imaging data or complex healthcare data structures
About Clario
Clario is a leading healthcare research and technology company that generates high-quality clinical evidence for our pharmaceutical, biotech, and medical device partners. We offer comprehensive evidence generation solutions that combine eCOA, cardiac safety, medical imaging, precision motion, and respiratory endpoints.
Since our founding more than 50 years ago, Clario has delivered deep scientific expertise and broad endpoint technologies to help transform lives around the world. Our endpoint data solutions have supported clinical trials over 26,000 times in more than 100 countries. Our global team of science, technology, and operational experts has supported over 60% of all FDA drug approvals since 2019.
For more information, visit Clario.com