About the role
The Vital real-time health data for Trials, Artificial Intelligence and a Learning Health System (VITAL) platform is a federally funded, secure, high-performance computing environment being developed in Ontario, Quebec and Alberta that will allow hospital data to be collected every 24 hours, linked, deidentified, and accessed by researchers through a unified portal. VITAL expands upon the GEMINI platform at Unity Health Toronto, which is a collaborative data and analytics platform that allows 35+ Ontario hospitals to accelerate research and quality improvement, leading to excellent hospital care.
The VITAL team is seeking an experienced Data Engineer to join this innovative network. The Data Engineer will extract, transform and load data from source systems (EHRs, administrative databases, registries, etc.) into repositories or pipelines so that the data can be used for analytics (reporting, business intelligence, statistical modeling, machine learning, etc.). The scope of this role is end-to-end: it includes all activities from data collection, requirements gathering and database analysis, to pipeline design and programming, to ongoing QA and maintenance of data pipelines and target data systems.
The incumbent will play an integral role liaising with various departments at Unity Health Toronto and across VITAL’s collaborating institutions and hospitals, driving operational excellence for both GEMINI and VITAL. This will be a challenging, rewarding, and fast-paced environment involving interaction with clinicians, researchers, data scientists, and hospital leaders. The ideal candidate will possess an exceptional understanding of health data systems and technologies, excellent skills in working with inter-institutional stakeholders, and initiative.
Duties and responsibilities
Requirements gathering and data analysis
Engages with stakeholders and data scientists to understand their analytic/research/business needs, and works collaboratively with them to develop and validate concrete business requirements for data collection and data pipeline design
Documents business requirements as required
Carries out analysis of source system databases to discover the structure, flows, functions, and interdependencies of data within the system using custom written SQL scripts or an enterprise tool.
Documents findings and presents to the stakeholders and data scientists.
Data pipeline development
Based on requirements and analysis findings, conducts detailed interviews with subject matter experts and data scientists to support data collection and inform the design of a logical and scalable pipeline model
Documents data pipeline model using mock-up or partial outputs as appropriate, and validates with stakeholders and data scientists
Constructs and automates the data pipeline from validated model
For data streams, uses HL7/FHIR or another appropriate messaging standard to create messaging stream, listener and queuing solutions
Designs and implements robust error and exception handling procedures
Documents data pipeline architecture
Monitoring and troubleshooting of data pipelines, target data systems
Routinely monitors system logs/alerts for error and exception detection
Manually runs jobs/restarts pipelines when automation fails
Maintains ongoing quality assurance to correct for data drift
Collaborates with IT Security team to monitor and protect pipelines against data leakage and ensure compliance with healthcare data privacy policies and regulations.
Patches/updates data pipeline software, scripting tools, etc. as needed
Routinely monitors data pipeline health and optimizes code when necessary
Qualifications
Undergraduate degree in Computer Science, Engineering, Biostatistics and/or related discipline and at least 5-7 years’ relevant experience managing large scale projects required, preferably in a healthcare setting OR demonstrable equivalent combination of specialized education and experience.
Extensive knowledge in the design and development of data pipelines required
Mastery of SQL and of at least one of R or Python required; working knowledge of different database systems (MS SQL Server, Oracle, Netezza, etc.) preferred
Experience designing and running testing scenarios required
Familiarity with cloud architecture (AWS, Azure, GCP, etc.) preferred
Working knowledge of Kafka, Storm, Spark and/or Hive preferred
Knowledge of Linux commands and shell scripts required
Good knowledge of data warehouse and BI/DW concepts (e.g., facts, dimensions, star/snowflake schema structures, 3NF modeling, metadata management) required
Working knowledge of version control and collaboration platforms (GitHub, GitLab, etc.) required
Experience with containerization and orchestration technologies preferred
Knowledge of HL7 and FHIR standards preferred
Good judgment in deciding which issues to escalate and which to resolve independently, and in suggesting possible resolutions, required
Ability to learn new technology quickly
Ability to lead or participate in business meetings to capture requirements, provide status updates and document the requirements
Ability to maintain responsibility for all aspects of pipeline development, automation and improvements/operations for a project
Ability to work effectively in a team environment and across all organizational levels, where flexibility, collaboration and adaptability are important
Unity Health Toronto is committed to creating an accessible and inclusive organization. We strive to provide a recruitment process that is barrier-free and in compliance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code. We understand that you may require an accommodation at any stage of the recruitment process. When you are contacted, please inform the Talent Acquisition Specialist and we will work with you to meet your accommodation needs. We want to emphasize that all accommodation requests are handled with the utmost confidentiality, respecting your privacy and dignity.
About Unity Health Toronto
Unity Health Toronto, comprised of Providence Healthcare, St. Joseph’s Health Centre and St. Michael’s Hospital, works to advance the health of everyone in our urban communities and beyond. Our health network serves patients, residents and clients across the full spectrum of care, spanning primary care, secondary community care, tertiary and quaternary care services to post-acute through rehabilitation, palliative care and long-term care, while investing in world-class research and education.