Top Benefits
Generous paid time off for personal, vacation, parental, and medical leave.
Comprehensive health coverage for employees and families at little or no cost.
About the role
Who you are
- 10+ years as a Site Reliability Engineer, DevOps engineer, or in a similar role in cloud-native environments (AWS focus)
- Deep technical proficiency with AWS services (EC2, EKS, S3, RDS, IAM, etc.)
- Expert-level experience managing, tuning, and scaling PostgreSQL databases
- Advanced skill in Terraform (modular design, environment promotion, CI/CD integration)
- Proficient in building and operating CI/CD systems (GitLab CI, GitHub Actions, or equivalent)
- Hands-on experience with GitOps workflows (Argo CD, Flux, etc.)
- Strong knowledge of Kubernetes (deployment, scaling, networking, security)
- Experience with monitoring and logging stacks (Datadog, Prometheus, Grafana, ELK, etc.)
- Track record in designing, communicating, and executing complex infrastructure roadmaps
- Experience mentoring and enabling engineering teams
- Strong written and verbal communication skills
- Professional certifications (AWS Solutions Architect, Kubernetes, Terraform)
- Experience in fintech, SaaS, or high-compliance industries
- Exposure to data privacy regulations and secure software development practices
- Bachelor's degree in computer science or a similar field
What the job involves
- We are seeking a seasoned Senior Site Reliability Engineer with deep expertise in AWS to own, architect, and continuously evolve Irwin’s core infrastructure
- You will plan, build, and optimize the systems that support our web applications and internal tools, ensuring scalability, reliability, observability, and security
- Your technical judgment, roadmap planning skills, and hands-on expertise will enable our engineering teams to ship features with velocity and confidence
- Design and execute long-term strategies for scalable, secure infrastructure to host the Irwin web application and associated tooling on AWS/EKS with PostgreSQL
- Architect and manage highly available cloud environments on EKS/Kubernetes using best practices for cost, performance, and security
- Oversee, tune, and ensure the high availability of large-scale PostgreSQL databases; optimize for performance, backup, disaster recovery, and observability. Bonus points for experience using Snowflake or other OLAP systems
- Lead the adoption and maintenance of Terraform workflows to manage infrastructure; ensure reproducibility, modularity, and CI/CD integration
- Build, maintain, and scale CI/CD pipelines using GitOps principles to automate deployments, reduce risk, and speed up delivery cycles
- Design, deploy, and manage production-grade Kubernetes clusters; automate scaling and implement robust security practices
- Implement monitoring, logging, and alerting solutions; establish best practices for incident detection and resolution
- Apply industry best practices for infrastructure and data security; ensure governance and compliance with relevant standards (e.g., SOC 2, GDPR)
- Mentor SRE peers and engineering teams on DevOps/SRE methodologies; document, communicate, and evangelize infrastructure best practices
The application process
- Please attach your resume and a cover letter describing your approach to architecting scalable infrastructure and your experience with AWS, PostgreSQL, Terraform, GitOps, CI/CD, and Kubernetes
Benefits
- A competitive package offering generous paid time off for personal, vacation, parental, and medical leave.
- Comprehensive health coverage for employees and their families, at little or no cost to employees.
- Discounted services at gyms and wellness facilities.
- Free working lunch in the office Monday through Thursday.
- A social community involved in sports, charities, and in-office events.
- Certification reimbursement for eligible expenses related to the CFA, IPM, CAIA, and FRM exams.