About the role
Jobright is a next-generation AI job search platform built to make career navigation faster, smarter, and more personal. We are looking for a Site Reliability Engineer to keep the systems behind our AI agents fast, resilient, and ready to scale as millions of job seekers depend on them every day.
Why Join Us
Own the infrastructure that keeps real-time AI agents running reliably for users making important career decisions
Tackle problems unique to LLM-powered systems, from inference latency and cost optimization to handling unpredictable traffic spikes
Work with engineers who treat reliability as a product feature, not a clean-up job that happens after the fact
Join a team where automation, observability, and thoughtful on-call practices are first-class investments
Responsibilities
Design, build, and maintain the cloud infrastructure that powers Jobright's AI agents, APIs, and user-facing services
Improve system observability through metrics, logging, and tracing, making it easier for the whole team to understand what's happening in production
Partner with product and engineering teammates to harden new features before launch, owning capacity planning, performance testing, and rollout strategies
Lead incident response when things go wrong, run blameless post-mortems, and turn each incident into durable improvements in reliability and tooling
Qualifications
Required
Early to mid-career engineer with 1 to 3 years of experience in site reliability, DevOps, platform, or backend engineering
Strong communicator who can break down complex infrastructure tradeoffs for engineers, product partners, and leadership alike
Solid grounding in cloud platforms, containerization, CI/CD pipelines, and the fundamentals of distributed systems
Preferred
Prior experience supporting production AI/ML workloads or high-throughput API services at a tech or AI-focused organization
Demonstrated comfort operating in fast-moving environments where on-call coverage, incident response, and infrastructure changes happen in parallel
Hands-on skills in AWS or GCP, Kubernetes, Terraform, monitoring stacks like Datadog or Prometheus, and scripting in Python or Go
About Jobright.ai
Jobright is the first AI-native hiring platform that connects top talent with great employers faster than ever before.
Job seekers: Your AI job search agent that turns a solo, time-consuming hunt into a fast, expert-guided path to landing your ideal job.
Employers: Your AI recruiting partner that delivers only active, qualified candidates with speed and precision.