Senior Director / VP, AI Production Reliability & Trust

10+ years experience

AI Summary

This role is accountable for ensuring the production reliability of the Hybrid AI Platform and customer deployments under real-world constraints, while also building frameworks for enterprise customers to trust autonomous AI agents performing work on their behalf. The focus is on runtime governance for non-deterministic, agentic AI systems serving high-stakes clients.

The Role

The Senior Director / VP of AI Production Reliability & Trust is accountable for two things:

1. Production reliability: ensuring HAP (our Hybrid AI Platform) and our customer deployments operate dependably under real traffic, real data, and real regulatory constraints.

2. Agent trust: building the technical and operational frameworks that allow enterprise customers to trust autonomous AI agents doing work on their behalf.

 

This is not a pre-release testing role. It is not a test automation role. It is not a QA team management role.

This is a runtime governance role for non-deterministic, agentic AI systems. The systems you govern make decisions autonomously. The customers who depend on them are sovereign governments and large enterprises with no tolerance for unpredictable agent behavior.

 

What you will actually build

       A production quality operating system: quality gates, phase transition criteria, an incident taxonomy, and an observability spec across our 6-layer Reference Architecture

       A continuous validation framework for agentic workflows — not test scripts run by humans, but autonomous evaluation pipelines that catch regression without human intervention

       An agent decision qualification framework: risk-tiered oversight for autonomous agent decisions, from ephemeral actions that need no review to high-stakes decisions that require multi-model consensus

       A trust evidence system: the observable signals — audit trails, behavioral consistency records, policy compliance evidence — that enterprise customers use to extend trust to agents operating on their behalf

       Production observability: instrumentation across Ingest, Prepare, Serve, Orchestrate, Monitor, and Optimize layers of the Reference Architecture

       A post-mortem and CAPA system: every production incident produces a root cause, a corrective action, and a new test that prevents recurrence

 

What we are not looking for

We want to be direct so you don't waste your time:

       Leaders who will hire a team first and direct them to build — we need someone who builds first and delegates second

       Candidates whose answer to "how would you do X?" is "I'd talk to my network" or "I'd evaluate vendors" — we need someone who already has answers

       Enterprise QA professionals whose toolkit is Selenium, Datadog, LoadRunner, or similar AI-washed commercial tools — we use open-source and next-gen frameworks and we expect you to know them

       Candidates whose production AI experience means "I oversaw a team monitoring an AI model" — we need someone who has implemented governance for autonomous agent systems

       People who need defined scope and predictable hours to do their best work

 

What we are looking for

       10+ years in quality, reliability, or production operations for complex distributed systems — with at least some of that time governing AI or ML systems in live production

       Direct implementation experience with AI quality frameworks — you built it, not just led a team that built it

       Familiarity with the agentic AI quality problem: non-deterministic systems, hallucination detection, behavioral drift, autonomous decision governance

       Working knowledge of open-source evaluation and observability frameworks (LangSmith, Arize/Phoenix, RAGAS, PromptFlow, Weights & Biases, or similar) — not just commercial alternatives

       Background in regulated industries (financial services, telecom, healthcare, government) where AI quality failures have real contractual and commercial consequences

       Startup orientation: comfortable with ambiguity, iterative scope, and a team that moves faster than most people expect
