About the role
Who you are
- Experience building production ML infrastructure with strong systems fundamentals
- Hands-on work with agentic systems, multi-agent workflows, or agent development frameworks
- Familiarity with model routing and LLM provider frameworks across different model types and environments
- Experience with scalable, fault-tolerant distributed systems and Kubernetes
- A track record of moving quickly on prototypes and making good decisions about productionization
- Experience across on-prem, private cloud, and public cloud environments
- Familiarity with storage systems, embedded databases, or filesystem abstractions
- Experience with code execution sandboxes such as gVisor, Firecracker, Kata, or WASM runtimes
- Interest in emerging ML infrastructure, edge inference, or browser-native models
- Open-source contributions to LLM or agent infrastructure projects
- Experience with identity, workload auth, or capability-based security systems
- If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply!
What the job involves
We’re building the next generation of agentic AI infrastructure at Cohere. This team sits at the intersection of ML systems, distributed infrastructure, and developer experience, creating the platform that powers autonomous AI agents at scale. You’ll work on hard, forward-looking problems with few established patterns, including secure code execution, agent state management, model routing, identity and authentication, and resource management for long-running agent workflows. This role is a strong fit for someone who combines systems depth with ML intuition. You should be comfortable building reliable infrastructure, thinking through distributed systems tradeoffs, and understanding how emerging agentic capabilities shape platform design.
- Secure execution environments for agent-generated code
- Identity, authentication, and trust boundaries for agents
- Model routing and orchestration across different model types and environments
- Rate limiting, quotas, and resource management for agent workflows
- State management, memory, and filesystem abstractions for agents
- Turn emerging ML research ideas into production-ready infrastructure
- Build core platform capabilities for execution, storage, and state management
- Prototype and evaluate new technologies, then help decide what should move into production
- Partner with research teams to shape infrastructure based on what future agent systems will need
Benefits
- Six weeks’ paid vacation
- Equity / stock options
- RRSP, 401(k), and Pension Scheme contributions
- Coverage for 100% of your insurance premiums across health, dental, vision, and travel
- Additional coverage for accessing mental health providers/services
- Six months of fully paid parental leave, including adoption and surrogacy
- Financial support for egg freezing and IVF in Canada and the UK
- A monthly fitness and wellness allowance
- Globally dispersed company that supports a remote work culture
- A $2,000 annual education benefit for professional development
- A weekly stipend for meals when working remotely and catered lunch when working from one of our global offices
- A monthly arts and culture allowance
- A monthly quality time allowance
About Cohere
Cohere is the leading data security-focused enterprise AI company. It is a global technology company co-headquartered in Toronto and San Francisco, with key offices in London and New York. The company builds enterprise-grade frontier AI models with industry-leading multilingual capabilities designed to solve real-world business challenges. Cohere’s AI solutions are cloud-agnostic to meet companies wherever their data is stored and offer the highest levels of security, privacy, and customization with on-premises and private cloud deployment options.