Big Data & Data Engineering Developer (Python / Spark / Azure)
About the role
Mission
Your mission is to design and build scalable data platforms and services using Python and Spark, delivering Data-as-a-Service and Infrastructure-as-a-Service capabilities to internal and external clients. You will contribute to modern data pipelines, cloud-native deployments, and API-driven data access in an Azure ecosystem.
Day-to-Day Responsibilities
- Design, develop, and maintain data pipelines and RESTful APIs using Python (3.8+) and Apache Spark.
- Build and optimize distributed data processing workflows (batch & streaming) for large-scale datasets.
- Contribute to data platform architecture leveraging Azure services (e.g., Data Lake, Databricks, AKS).
- Follow Git-based workflows (GitHub/GitLab) and enforce versioning and code quality standards.
- Work within Agile methodologies (Scrum/Kanban) using Jira.
- Deploy applications and data pipelines using CI/CD pipelines (Jenkins/Azure DevOps).
- Containerize and deploy services on Kubernetes (AKS) ensuring scalability and reliability.
- Collaborate with stakeholders to analyze business requirements and translate them into data solutions.
- Communicate effectively with clients and internal teams; synthesize and present findings clearly.
- Partner with Ops/DevOps teams to ensure production readiness, monitoring, and reliability.
- Ensure compliance with data governance, security, and best engineering practices.
- Continuously improve platform performance, cost efficiency, and maintainability.
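The batch side of the pipeline work above follows a familiar extract–transform–load shape. A minimal plain-Python sketch of that shape (in production the aggregation would be a PySpark DataFrame groupBy; the `Event` fields and sample rows are invented for illustration):

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Event:
    user_id: str
    amount: float


def extract(raw_rows: Iterable[dict]) -> Iterable[Event]:
    """Parse raw rows into typed records, dropping malformed ones."""
    for row in raw_rows:
        try:
            yield Event(user_id=row["user_id"], amount=float(row["amount"]))
        except (KeyError, TypeError, ValueError):
            continue


def transform(events: Iterable[Event]) -> Dict[str, float]:
    """Total amount per user -- the groupBy/sum Spark would distribute."""
    totals: Dict[str, float] = {}
    for e in events:
        totals[e.user_id] = totals.get(e.user_id, 0.0) + e.amount
    return totals


def load(totals: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort results so the downstream write is deterministic."""
    return sorted(totals.items())


raw = [
    {"user_id": "a", "amount": "2.5"},
    {"user_id": "b", "amount": "1.0"},
    {"user_id": "a", "amount": "0.5"},
    {"user_id": "c"},  # malformed: no amount, dropped by extract
]
print(load(transform(extract(raw))))  # → [('a', 3.0), ('b', 1.0)]
```

Keeping each stage a small, typed function is what makes pipelines like these testable and easy to port between batch and streaming runners.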
Technical Skills
Strong expertise in:
- Python (3.8+) with focus on data engineering and backend development (3+ years)
- Apache Spark / PySpark for distributed data processing (2+ years)
- SQL & NoSQL databases (data modeling, optimization)
- Git (GitHub/GitLab) and collaborative development workflows
- CI/CD tools (Jenkins, Azure DevOps) and automation (1+ year)
- Object-Oriented Programming and clean architecture principles
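The SQL bullet above, modeling plus optimization, can be illustrated with the standard-library `sqlite3` module (table name, columns, and data are invented for the example; real work would target the project's actual databases):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id TEXT NOT NULL, amount REAL NOT NULL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 2.5), ("b", 1.0), ("a", 0.5)],
)
# Indexing the grouping column is a typical first optimization step.
conn.execute("CREATE INDEX idx_orders_user ON orders(user_id)")
rows = conn.execute(
    "SELECT user_id, SUM(amount) FROM orders GROUP BY user_id ORDER BY user_id"
).fetchall()
print(rows)  # → [('a', 3.0), ('b', 1.0)]
```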
Proficient in:
- Azure ecosystem (e.g., Azure Data Lake, Databricks, AKS, Functions)
- Data pipeline orchestration (Airflow, Azure Data Factory, or equivalent)
- Containerization (Docker) and orchestration (Kubernetes / AKS)
- RESTful API development and integration
- Agile methodologies (Scrum/Kanban), TDD, and unit testing
- UNIX/Linux environments and best practices
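For the TDD and unit-testing bullet, a red/green cycle starts from tests like the ones below (`normalize_ids` is a hypothetical helper, shown only to make the tests concrete):

```python
import unittest


def normalize_ids(ids):
    """Trim, lowercase, and deduplicate identifiers, keeping first-seen order."""
    seen = []
    for raw in ids:
        ident = raw.strip().lower()
        if ident and ident not in seen:
            seen.append(ident)
    return seen


class TestNormalizeIds(unittest.TestCase):
    def test_dedupes_case_insensitively(self):
        self.assertEqual(normalize_ids(["A", "a ", "B"]), ["a", "b"])

    def test_drops_blank_entries(self):
        self.assertEqual(normalize_ids(["", "  ", "x"]), ["x"])


# Run the suite in-process rather than via unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNormalizeIds)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a TDD flow the test class is written first, fails against a stub, and the helper is then implemented until the suite passes.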
Desired / Plus
- Experience with real-time data streaming (Kafka, Spark Streaming, Event Hub)
- Knowledge of data governance, data quality, and lineage tools
- Familiarity with Flask / FastAPI / OpenAPI for data service exposure
- Infrastructure-as-Code (Terraform, ARM templates, Bicep)
- Monitoring tools (Prometheus, Grafana, Azure Monitor)
- Strong documentation and presentation skills
Competencies
- Strong client-focused mindset with a data-driven approach to problem solving
- Ability to work across data, development, and operations teams
- Excellent collaboration skills in a global and cross-functional environment
- Strong analytical thinking with attention to detail and performance optimization
- Ability to clearly communicate complex data concepts to technical and non-technical audiences
- Proactive mindset with continuous improvement orientation
Experience Needed
- Minimum 3 years in Data Engineering / Big Data development
- Hands-on experience with Python and Spark in production environments
- Experience with cloud platforms (preferably Azure) and containerized deployments
- Familiarity with DevOps and ITIL processes is a plus
- Ability to quickly adapt to new technologies and environments
Educational Requirements
- Master’s Degree in Engineering, Computer Science, or related field
Certifications (Nice to Have)
- Azure Data Engineer Associate (DP-203) or equivalent
- Databricks / Spark certifications
- Agile certifications (Scrum, SAFe)
Languages
- Fully bilingual: English and French
About CGI
Insights you can act on to achieve trusted outcomes.
We are insights-driven and outcomes-focused to help accelerate returns on your investments. Across 21 industry sectors and 400 locations worldwide, we provide comprehensive, scalable and sustainable IT and business consulting services that are informed globally and delivered locally.