About the role
Who you are
- We're searching for passionate individuals eager to contribute to Alpaca's rapid growth
- If you align with our core values—Stay Curious, Have Empathy, and Be Accountable—and are ready to make a significant impact, we encourage you to apply
- 5+ years of experience in Site Reliability Engineering, Performance Engineering, or similar roles
- 5+ years of experience with multi-terabyte scale PostgreSQL clusters
- Proven track record of managing and maintaining large-scale, high-availability, and high-performance PostgreSQL databases
- Experience designing and implementing SLIs, SLOs, and SLAs for internal systems and databases
- Experience with troubleshooting PostgreSQL performance problems and slow queries
- Extensive experience with efficient schema design and efficient query design
- Experience migrating multi-terabyte tables into more efficient schemas
- Proficient with Go
- Proficient with Prometheus
- Proficient with Linux
- Knowledgeable in trading/fintech domains
- Experience with low-latency systems
- Experience with distributed tracing
- Experience scaling PostgreSQL clusters rapidly
- Experience with pgx, gorm, or sqlc
What the job involves
- As a Site Reliability Engineer (SRE) at Alpaca, you will ensure the reliability, scalability, and performance of our systems and services
- You will work closely with development, operations, and DevOps teams to build and maintain robust applications, ensuring they run smoothly and efficiently
- This role requires a blend of software engineering and operations skills, with a strong ability to troubleshoot technical issues and resolve problems before they impact our users
- Triage difficult technical problems and implement solutions
- Improve our observability stack (monitoring, logging, profiling)
- Incident Management: Respond to and resolve incidents in a timely manner, conducting post-incident reviews to identify and implement improvements
- Collaboration: Work closely with development teams to ensure new features and services are designed with reliability and scalability in mind
- Capacity Planning: Monitor system capacity and performance, making recommendations and implementing changes to handle future growth
Benefits
- Competitive Salary & Stock Options
- Benefits: Health benefits start on day 1. In the US, this includes medical, dental, and vision coverage. In Canada, this includes supplemental health care. Internationally, this includes a stipend to offset medical costs
- New Hire Home-Office Setup: One-time USD $500
- Monthly Stipend: USD $150 per month via a Brex Card
- Work with awesome people, clients and partners from around the world
About Alpaca
Alpaca is a developer-first API brokerage platform that supports hundreds of businesses globally. Alpaca offers stock, options, ETF and crypto trading, real-time market data, and end-to-end brokerage infrastructure through modern APIs.
Alpaca has raised over $120m in funding and is backed by top investors in the industry globally, including Portage Ventures, Spark Capital, Tribe Capital, Social Leverage, Horizons Ventures, Unbound, SBI Group, Eldridge, Positive Sum, Elefund, and Y Combinator.