About the Role
You'll help define an emerging area: how to find and neutralize the security risks that arise when agents act, plan, and use tools autonomously. This role is both research-heavy and engineering-heavy: you'll design experiments, build prototypes, fine-tune models, and pressure-test systems against adversarial behavior. You'll iterate quickly, learn from failures, and scale what works, while building the monitoring and evaluation infrastructure that makes progress measurable.
In this role you will:
- Define and validate threat models for agentic systems, identifying which tool characteristics must co-exist to enable data exfiltration and malicious state change, and how to break those combinations
- Design and run experiments: build synthetic environments (such as file systems and tools), create task distributions that contain attack paths, and apply different attack strategies
- Break agentic systems, both manually and with optimization algorithms such as RL
- Design and improve static and dynamic analysis methods that automatically map tool capabilities to risk across diverse tool ecosystems, and make those methods scale
- Turn research insights into product-facing capabilities: risk classification, automated guardrail generation, and quantitative threat measurement
- Build measurement tools: eval harnesses, monitoring, dashboards, and feedback loops that quantify security outcomes
- Build capability and regression evals
- Optimize systems for real-world constraints (latency, cost, reliability) without losing scientific rigor
You might thrive in this role if:
- You have an MS or PhD in CS/ML (or equivalent research experience) and enjoy working under uncertainty
- You've fine-tuned and evaluated models in practice and can reason about data quality, overfitting, evals, and deployment constraints
- You can write strong production code, and you're comfortable owning the infrastructure that makes agentic evals run end-to-end. You care about reproducibility and instrumentation. No AI slop.
- You're motivated by security problems and enjoy thinking like both builder and attacker
- You reason about how capabilities combine into risk: not just individual vulnerabilities, but system-level attack surfaces across tool ecosystems
- You communicate clearly, iterate fast, and can hold a technical narrative from "hypothesis" to "shipped"
What we offer
- Competitive salary + equity, so you share in the company's upside
- Work at the forefront of AI security, helping define a new category
- Remote-friendly, with a preference for candidates based in Amsterdam, Paris, Poland, New York, or San Francisco
- Fully funded team retreats every 8 weeks
- Health insurance allowance for you and your dependents
- Wellbeing, learning, and home office allowances (to support health, growth, and your setup)