The role
You'll design and build LLM-powered systems and agents that integrate into, and perform, real offensive security work. Think of it as taking what a strong pentester or red teamer does intuitively and encoding it into systems that operate continuously and at scale.
We're hiring two people for this team, and strength can come from either of two directions:
- You're an offensive security specialist, a pentester or red teamer, who wants to go all-in on AI engineering, or
- You're a strong engineer / AI engineer who's drawn to offensive security and the agent problems that live there.
The work is the same either way. You only need to arrive deep in one of these two areas; we'll back you to grow fast in whichever side is newer to you.
What you'll work on
- Autonomous hacking agents. LLM agents that perform full attacks with minimal human intervention, including, but not limited to: reconnaissance, web application exploitation, external exploration, post-exploitation reasoning, and reporting.
- Agent architecture. Planning loops, tool use, memory, and state management that let agents reason across long, multi-step attack chains without getting lost.
- Prompting and orchestration. Translating attacker workflows and tradecraft into prompts, decision graphs, and orchestration layers that hold up under adversarial conditions.
- Tool integration. Writing custom agent tooling, integrating third-party tools, and wrapping the offensive security toolchain (Burp, ffuf, nuclei, sqlmap, custom exploits, browser automation, etc.) into agent-callable interfaces.
- AI feature development. Building AI solutions that solve real-world problems our consultants face, from building in-house chatbot interfaces to text-generation features for strategy ideation and more.
- Evaluation and benchmarking. Building eval harnesses and realistic target labs so we can measure AI feature and agent performance with rigour, not vibes.
- False positive reduction. A finding only counts if it's real. You'll work on validation logic, exploit confirmation, and reasoning chains that eliminate hallucinated vulnerabilities.
- Production reliability. Testable, observable, maintainable systems.
- Broader applied AI and tool research. AI moves fast. Expect to research new approaches and technologies regularly to see what’s really worth it and what’s just hype.
This is a broad, evolving remit - the balance between agentic systems, applied AI, and research will shift as the problem does. You'll own your areas end to end, from a rough idea through to a shipped, working system, rather than being handed tightly scoped tickets.
Who you are
We expect strong candidates to come from one of two backgrounds:
- The offensive security specialist. You've spent years on the offensive side and you've hit a ceiling on what one person can find by hand. You've started building scripts, wrappers, and small agents to scale yourself, and you've realised AI engineering is where the leverage is. You know what a real attacker actually does - how to chain a misconfigured S3 bucket into a foothold, when an IDOR is worth pivoting on, why a session token format matters. That intuition is what separates an agent that produces noise from one that produces exploit-validated, actionable findings.
- The AI engineer. You're a strong engineer who's gone deep on LLMs and agents. You understand why most agent demos fall apart on the second step, what good tool use looks like, and how to debug a planning loop that's quietly going off the rails. You've watched the offensive security space long enough to know it's where some of the most interesting agent problems live, and you're hungry to get into it properly.
Whichever you are: you write good-quality code, you think rigorously, you take problems and run with them end to end, and you want to become the most security-fluent person in the AI room and the most AI-fluent person in the security room.
Essential: what we need from everyone
- Hands-on experience building with LLMs - using the likes of Claude Code and Codex to build out projects and accelerate dev workflows.
- Engineering ability in Python or TypeScript - you can write and ship reliable code.
- Some experience building AI solutions, even at a small scale: writing prompts and thinking about tool use, agent loops and/or AI evaluation, whether from work, open-source contributions, or side projects.
- Strong ownership and a self-starter mindset. You take problems end to end: from a vague prompt to a shipped, working system without waiting to be told what to do next. You're proactive, you set your own direction, and you close the loop.
- A big-picture thinker, comfortable with ambiguity. This is a frontier problem on a small, fast-moving team where not everything is mapped out and the right answer often isn't known yet. You keep the wider goal in view, make good technical bets with incomplete information, and iterate fast.
- Clear written communication. You can explain a complex finding to an engineer, and a complex agent failure mode to a non-AI specialist.
Essential: depth in one of these two areas
You don't need both. We need genuine depth in one, and real interest in the other.
If you come from offensive security:
- Professional offensive security experience: penetration testing, red teaming, bug bounty, or equivalent. You've found real vulnerabilities in production systems and you understand the difference between a CVE database entry and an actual exploit chain.
If you come from AI / software engineering:
- Strong production engineering: architecting and owning systems end to end, including deployments and CI/CD pipelines.
Strongly preferred
- Strong AI engineering experience with agent frameworks (LangGraph, CrewAI, custom orchestration).
- Experience evaluating LLM systems in production.
Nice to have
- Knowledge of cloud infrastructure and CI/CD pipelines.
The Perks
Join a team that values both excellence and balance:
- True remote flexibility - work from anywhere.
- No report-writing drudgery - we use our custom portal.
- Unlimited training to keep your skills sharp.
- Unlimited vacation - because burnout helps no one.
- Private medical insurance and pension scheme.
- Conference speaking bonuses.
- The hardware, software, lab environments, cloud credits and research materials you need to excel.
- A culture of radical candor, continuous improvement and technical excellence.
The Culture
At CovertSwarm, we take pride in pushing the boundaries of offensive security. Our team consists of passionate and humble professionals who value creativity, technical depth and delivering results that matter.
In this role, you will help shape how CovertSwarm attacks and reviews the infrastructure underpinning digital asset markets: keys, signers, wallets, APIs, smart contracts, settlement rails, cloud platforms, control planes and the humans operating them.
If you want to work at the intersection of offensive security, blockchain, encryption and financial infrastructure, we want to hear from you.
Ready to join the Swarm?
Take the next step in your cybersecurity career by applying today. Let’s talk about how your skills, research mindset and offensive capability align with CovertSwarm’s mission to redefine offensive security.