Staff Distributed Systems Engineer
Remote
United States, Canada
$164,000 - $289,000 per year
Staff
Top Benefits
Inclusive healthcare coverage
401(k) retirement plan
Flexible paid time off
About the role
Who you are
- BA/BS degree or equivalent experience
- At least 7 years (preferably 10+) of building and operating large-scale production distributed systems where latency, correctness, and reliability (99.99% uptime) are non-negotiable
- Deep backend systems experience in one or more modern server environments (e.g., Java, Go, Rust, Python, Node.js), with the ability to ramp up and adapt quickly in new stacks
- Expertise with distributed systems, concurrency, scaling, and debugging multi-layer systems
- Strong operational judgment: you define SLIs/SLOs, build observability, and improve systems via incidents and feedback loops, not heroics
- Staff behaviors: you lead multi-team initiatives, write decision-quality design docs, influence architecture beyond your immediate team, and communicate across the organization
- Ability to make decisions with incomplete information, understand and communicate one-way vs. two-way doors, and move with urgency while keeping critical code operational
- Curiosity and openness to growth: you actively build fluency in emerging technologies like AI to unlock creativity, accelerate progress, and amplify impact
What the job involves
- Report to the Senior Manager, Engineering
- Collaborate with exceptional engineers on building systems and services for the world's largest companies. This platform powers millions of production websites and supports massive scale, including over 1% of global Internet traffic and more than 10 billion monthly visits
- Lead architecture for distributed services at scale that synchronize shared state across clients, including clear correctness guarantees (e.g., ordering, idempotency, convergence). These services require low latency and high availability, with an SLO of 99.99% uptime
- Define concurrency and conflict-resolution semantics for concurrent changes, including trade-offs and constraints
- Design for failure: retries, partial outages, reconnection, and safe recovery paths, with explicit degradation behavior
- Own operational excellence: define SLIs/SLOs, instrument tracing/metrics/logging, and drive reliability improvements through incident learning
- Drive cross-team technical alignment via design docs and decision records; unblock execution across org boundaries
- Raise the bar through design and code reviews, mentoring, and pragmatic standardization that increases leverage
- Deliver maintainable, tested, performant systems and evolve them with a “crawl, walk, run” plan
- Use modern tooling (including AI-assisted coding, debugging, and code review) to improve developer velocity and reduce time-to-diagnosis in production
- Participate in engineering citizenship activities such as co-authoring engineering blogs, strengthening and improving our hiring processes, and leading internal hackathon teams
The application process
- Application deadline: applications are accepted on an ongoing basis until the position is closed and filled
Benefits
- Modern & inclusive healthcare coverage
- 401(k) and financial planning
- Flexible paid time off
- Annual retreat and offsites
- WFH office setup budget
- Health and wellness stipend
- Remote work reimbursements for phone & Wi-Fi
- Webflow subscription discount
- Remote-first flexibility