About the role

CanCap Group Inc. is part of a privately-owned Canadian national financial services company with multiple verticals across automotive, consumer, and merchant lending portfolios. We manage the entire lifecycle of the finance receivable, from credit adjudication through to contract administration, customer service, default management, and post charge-off recoveries. We are a company of innovators: we learn from each other, respect each other, and create together. When it comes to our customers, partners, and each other, we are always motivated by doing the “right thing”. We are always looking for the best people and the right methods to meet this goal as we grow into the future.

What Your Day and Week Could Look Like

Reporting to the Head of Data Platform, the Principal Software Engineer will be responsible for architecting and building data platform capabilities and integration services, including the design and development of utility software bundles, AI workflows, and high-performing ETL pipelines running on a Databricks environment with a dbt layer on top.

The ideal candidate will have a strong engineering and architecture background in building distributed systems and scalable data pipelines, transforming complex datasets, and enabling robust data analytics and business intelligence. You will work closely with cross-functional teams, including data scientists, senior architects, product leaders, and senior engineers from other domains, to deliver reliable, high-quality data solutions.

Key Responsibilities

  • Architect, design, and develop scalable, secure, and high-performance data platforms using technologies like Databricks, Apache Spark, Kafka, Delta Lake, and cloud-native services (GCP/AWS/Azure).
  • Own the end-to-end architecture for data ingestion, transformation (ETL/ELT), governance, and storage across structured, semi-structured, and unstructured datasets.
  • Drive engineering best practices, including CI/CD, testing, performance optimization, and observability.
  • Collaborate with cross-functional teams to translate business requirements into technical solutions that enable advanced analytics and AI-driven decision-making.
  • Mentor and guide senior engineers and influence engineering culture and technical direction across the organization.
  • Ensure high standards of data quality, security, and compliance, including role-based access controls, auditability, and lineage.
  • Contribute to the evaluation and adoption of emerging technologies and frameworks in the AI and data platform ecosystem.
  • Develop backend systems based on open-source technologies to support data platform and AI needs.
  • Implement streaming and batch processing architectures to support analytics and AI workloads.
  • Design, build, and optimize scalable, reliable data pipelines using Python, Databricks Workflows, and related ETL frameworks.
  • Lead the development of AI/ML platform capabilities, including model training pipelines, feature stores, model serving, and monitoring frameworks.
  • Develop and deploy modular AI/ML components that integrate with the data platform using frameworks like PyTorch, TensorFlow, Hugging Face Transformers, etc.
  • Work with MLOps tools (e.g., MLflow, Vertex AI, SageMaker) to manage model lifecycle, reproducibility, and deployment pipelines.
  • Integrate third-party AI services (e.g., OpenAI) into internal applications and decision engines.
  • Develop and maintain robust data models and transformations using dbt (Data Build Tool).
  • Manage and orchestrate data workflows on Google Cloud Platform (GCP) using dbt, Databricks, Dataflow, Cloud Composer, and Cloud Storage.
  • Collaborate with stakeholders to understand data needs and provide clean, structured data for analytics and reporting.
  • Implement data quality, validation and governance practices to ensure trust in the data platform.
  • Mentor other senior engineers and lead design and code reviews.
  • Monitor and troubleshoot data pipeline issues, ensuring performance and reliability.
  • Troubleshoot and resolve Databricks platform issues, identifying root causes of system performance bottlenecks.

What You Bring

  • 10+ years of experience in software architecture, development, and data engineering, or in building scalable data/AI platforms.
  • 5+ years of experience in data platform architecture and distributed system design.
  • 4+ years of experience with Lakehouse-based design patterns and related technologies.
  • Strong experience with Databricks and Spark-based data processing.
  • Proficient in Python for data manipulation, scripting, and automation.
  • Proven experience building AI/ML platforms or integrating ML models into production-grade systems.
  • Expertise in Python, Scala, or Java, and familiarity with SQL and data modeling principles.
  • Solid understanding of cloud-native data services (e.g., GCP BigQuery, AWS S3/Glue, Azure Synapse).
  • Experience with DevSecOps best practices: containerization (Docker/Kubernetes), infrastructure-as-code (Terraform), CI/CD pipelines, and code scanning.
  • Strong communication skills and ability to influence multiple levels of the organization.
  • Strong hands-on experience with Python and SQL, plus Java, Scala, or Go.
  • Hands-on experience with ETL frameworks and best practices.
  • Solid understanding and practical experience with dbt for data modeling and transformation.
  • Deep knowledge of Google Cloud Platform (GCP), especially BigQuery, Dataflow, and Cloud Storage.
  • Experience working with structured, semi-structured and unstructured data (e.g., JSON, Parquet, Avro).
  • Experience with open table formats such as Apache Hudi, Iceberg, and Delta Lake.
  • Solid understanding of data architecture patterns (e.g., Lakehouse, medallion architecture, CDC, data mesh).
  • Experience with AI/ML frameworks and integrating LLMs or generative AI tools via APIs.
  • Experience working with open-source tools and dev communities.
  • Excellent problem-solving, communication, and collaboration skills.
  • Experience working with large-scale data processing, data structures, and algorithms.
  • Experience with data orchestration tools like Apache Airflow or Cloud Composer.
  • Familiarity with CI/CD pipelines and version control (e.g., Git).
  • Background in working with large-scale distributed systems and performance tuning.
  • Knowledge of data governance and security best practices in cloud environments.

Preferred Qualifications

  • Experience with streaming platforms (e.g., Kafka, Pub/Sub, Flink) and real-time analytics.
  • Familiarity with AI/ML lifecycle tools like MLflow, Kubeflow, Vertex AI, or SageMaker.
  • Deep knowledge of data governance, cataloging, lineage, and compliance frameworks.
  • Contributions to open-source data or AI/ML frameworks are a plus.
  • GCP certification (e.g., Data Engineer, Cloud Developer).
  • Certifications in Databricks (e.g., Databricks Certified Data Engineer).
  • Experience with multi-cloud environments.
  • Expertise in software engineering best practices and ML platform development.
  • Understanding of networking concepts (VPC, firewall rules, peering, etc.) in GCP.
  • Experience in FinTech, HealthTech, or regulated industries.
  • Exposure to Kubernetes, Docker, and distributed systems engineering.

Nice to Have

  • Master's degree in Computer Science or related technical fields.
  • Proficiency in performance tuning, large-scale data analysis, and debugging.

What You Can Expect From Us

Our Employee Experience is aimed at supporting and inspiring our talented team through:

  • A passionate team dedicated to supporting and empowering others.
  • An environment where creative, innovative thinking is encouraged.
  • Health and Dental Benefits.

Work Location & Remote Flexibility

  • This role follows a hybrid model, requiring employees to work 50% in-office, with flexibility to work remotely or from the office on other days.
  • The company has two office locations:
    • Downtown Toronto (Church Street) – The tech team is primarily based here.
    • Mississauga – A second office location, used less frequently by the tech team.

CanCap is an equal opportunity employer and values diversity. We are committed to building and evolving a team reflecting a variety of backgrounds, perspectives, and skills. To be considered for employment, you will need to successfully pass a criminal background check and validate your work experience.

Next Steps

Adding to our team is an important step in our business. We’ve taken time to be purposeful and thoughtful with this job posting, and we encourage you to do the same with your application. Help us understand how your experience aligns with this role and how you can contribute to our Databricks-driven data platform.

About CanCap Group Inc.

201-500 employees

We manage the entire lifecycle of the finance receivable from credit adjudication through to contract administration, customer service, default management and post charge-off recoveries. We are a company of innovators: we learn from each other, respect each other, and create together. We strive to inspire our customers by continually understanding them, meeting their needs, and keeping them happily surprised. And we always do so with integrity.
