GPU Performance Engineer
Remote
Cambridge
£94,642 - £175,764 per year
Mid level
About the role
Who you are
- Strong experience with GPU computing and performance analysis
- Experience with large-scale or distributed GPU systems
- Hands-on experience running and interpreting AI benchmarks and workloads
- Familiarity with scientific computing or HPC workloads
- Proven background in performance optimization and profiling
- Experience developing internal tooling or benchmarking frameworks
- Solid understanding of the full stack, including GPU architectures and memory hierarchies; drivers, runtimes, and system software; and AI frameworks and numerical libraries
- Knowledge of modern AI workloads (training and inference) and model architectures
- Strong coding and debugging skills (e.g., C/C++, Python, CUDA or similar)
- Experience with Linux systems and low-level debugging
- Experience using performance analysis and profiling tools
- Ability to reason about complex systems and explain performance behavior clearly
What the job involves
- We are seeking a GPU Performance Engineer to evaluate, analyze, and optimize performance across AI workloads and scientific computing applications. In this role, you will run and develop open-source AI benchmarks, build tooling to automate benchmarking and result collection, and deeply analyze performance results to identify and resolve bottlenecks across the full hardware and software stack
- This position is ideal for someone who enjoys low-level performance work, hands-on experimentation, and turning complex performance data into actionable insights
- Run, develop, and maintain open-source AI benchmarks (training and inference workloads) as well as custom AI and scientific computing workloads
- Design and implement benchmarking and automation tools to execute workloads, collect results, and ensure reproducibility
- Analyze and interpret performance data to identify compute, memory, communication, and I/O bottlenecks
- Perform performance optimization across models, kernels, libraries, and system configurations
- Troubleshoot performance issues across the full hardware/software stack, including GPUs, CPUs, interconnects, drivers, runtimes, and frameworks
- Collaborate with researchers, engineers, and systems teams to improve performance and efficiency
- Document findings and clearly communicate performance insights and recommendations
The application process
- We start with a 30-minute screening call
- We then follow with a 45-minute introductory call with Dr Rosemary Francis, our CTO
- Finally, a 2-hour technical deep dive with the other team members
- During the hiring process we may ask you technical questions about any technologies or experience that you list on your CV. You will be expected to write short snippets of code to solve specific problems
- We work on many joint projects with our sister organization, CommonAI CIC. By applying for this role you give us permission to share your details with CommonAI CIC so that we may consider you for open roles across both organizations. If you prefer not to have your details shared with CommonAI CIC or would not like to be considered for other roles then please let us know when you apply. This will not affect your application for this role