About the Role
At LatchBio, we build the benchmarks that frontier AI labs use to evaluate and train models on biological reasoning. RefusalBench tests whether AI systems can distinguish legitimate biological research from requests that present meaningful biosecurity risks.
We're looking for scientists with deep expertise in areas such as biosecurity, biosafety, pathogen genomics, infectious disease research, synthetic biology, biodefense, and public health. Your role will be to apply that expertise to determine where the boundary lies between routine scientific work and potentially dangerous biological capabilities.
You will review real-world biological analyses, protocols, datasets, and research papers to establish ground truth for AI evaluations. Some tasks should clearly be allowed. Others should clearly be refused. Many sit in the gray area. Your job is to identify the relevant risks, justify the correct decision, and convert those examples into structured evaluations that test whether AI systems reach the same conclusion.
What You'll Do
Review biological tasks, datasets, protocols, and research workflows to determine whether they should be accepted or refused from a biosecurity perspective
Identify the specific risk categories involved, including pathogen engineering, immune evasion, enhanced transmissibility, countermeasure resistance, and other dual-use concerns
Source real-world examples from the scientific literature and convert them into evaluation tasks with clear ground truth and supporting rationale
Document the reasoning and evidence supporting each decision so that evaluations are scientifically defensible and consistently graded
Review agent outputs to determine whether models correctly identify risks, over-refuse legitimate research, or fail to recognize dangerous capabilities
Help build a high-quality corpus of biosecurity evaluations grounded in real biological research and real-world risk assessment
Ideal Backgrounds
Biosecurity & biodefense programs
Biosafety and risk assessment organizations
Public health agencies (CDC, WHO, state health departments)
Pathogen genomics and molecular epidemiology groups
Synthetic biology and biotechnology governance
Pandemic preparedness and emerging infectious disease programs
Infectious disease surveillance and outbreak response
Compensation & Logistics
Salary: $120k–$180k (base + performance pay that scales with output)
Baseline output is 50 evals/week; performance pay scales linearly with output above that
Fast-track promotion for strong performers
Fully remote or onsite
Equity
100% premium-covered Blue Shield Platinum health plan ($0/$0)
Unlimited PTO
Visa sponsorship available
Hiring Process
Intro call with Saúl (Technical Recruiter)
Technical interview with Harmon (Pod Lead)
Cultural interview with Jordan (Chief of Staff)
Offer
Location: San Francisco, CA. In-person.