Sr. Software Engineer | Up to $150/hr
About the role
About The Job
Mercor connects elite creative and technical talent with leading AI research labs. We are headquartered in San Francisco, and our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.
Position: SWE Expert
Type: Contract
Compensation: $70–$150/hour
Role Responsibilities
- Convert high-level objectives into tightly scoped, testable deliverables with clear inputs/outputs and measurable success criteria.
- Create structured documentation defining expected behavior, constraints, and edge cases for reuse by other evaluators.
- Build lightweight automation scripts to support evaluation flows, such as generating required artifacts and validating outputs.
- Write deterministic Python verifier scripts for completion checks via final state or output validation.
- Design prompts/tasks to reliably elicit target workflow behavior while avoiding leakage of internal instructions.
- Implement robust error handling and actionable failure messages in verification tooling.
- Develop plausible but ineffective “baseline” or “distractor” approaches to confirm that the evaluation discriminates between effective and ineffective solutions.
- Maintain clean artifact hygiene with versionable structure, consistent naming, and reproducible execution.
Qualifications
Must-Have
- Strong Python skills in file system operations, parsing, validation, and deterministic execution.
- Experience with evaluation harnesses, automated grading, or QA-style verification.
- Familiarity with prompt design and LLM evaluation methodologies.
- Comfort with structured specs and documentation conventions like Markdown and YAML.
- Working knowledge of Git, CLI workflows, virtual environments, and dependency management.
Preferred
- Knowledge of embeddings/similarity concepts like cosine similarity for negative-control design.
- Ability to communicate clearly and control scope without relying on domain-specific context.
Application Process (Takes 20–30 mins to complete)
- Upload resume
- AI interview based on your resume
- Submit form
Resources & Support
- For details about the interview process and platform information, please check: https://talent.docs.mercor.com/welcome/welcome
- For any help or support, reach out to: support@mercor.com
PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.