About mpathic
mpathic is building the future of trustworthy AI. Grounded in behavioral science and human-centered design, we provide the infrastructure for building AI systems that are safe, aligned, and emotionally intelligent.
Building on our work in areas like RL gyms, red teaming, and benchmarking, we are creating the foundation for training, probing, and measuring advanced AI systems reliably, auditably, and at scale.
Position Overview
mpathic is seeking a Program Manager, Human Data Operations to lead the execution of complex AI safety, evaluation, human data, and red teaming programs for leading AI companies. You will report directly to the Co-Founder/Chief Innovation Officer. This role owns end-to-end delivery across cross-functional project teams and is accountable for project execution, quality, stakeholder communication, and customer outcomes.
The ideal candidate has significant experience managing large-scale services operations, including distributed teams of experts, annotators, reviewers, red teamers, contractors, and quality assurance personnel. They are comfortable operating in fast-paced startup environments where priorities evolve quickly and successful delivery requires proactive communication, operational rigor, and strong judgment.
You will act as the connective tissue between Data Services, QA, Engineering, AI/ML, Product, Data Science, and customer-facing teams—translating ambiguous customer and operational needs into dependable delivery, and keeping the many people involved aligned around what we are building and why.
This is a role about both execution and leadership. You should be as comfortable coordinating distributed expert workflows and managing customer expectations as you are building systems that improve project visibility, quality, and throughput. You will start as a hands-on individual contributor owning your delivery pod, with room to shape how program management scales here as the team grows.
What You’ll Accomplish
In your first 60–90 days you’ll…
- Build a deep understanding of mpathic’s human data, AI safety, red teaming, evaluation, annotation, QA, and reporting workflows, and a clear picture of how work flows from contract handoff to final delivery.
- Shadow and then take ownership of one or more active client programs, driving timelines, deliverable tracking, staffing coordination, risk management, and stakeholder communication.
- Become proficient in the systems, tools, trackers, reporting mechanisms, and communication workflows used across projects.
- Establish trusted working relationships with leadership, execution teams, customers, and other internal and external stakeholders.
- Begin identifying operational bottlenecks, process gaps, and opportunities for improvement.
In your first year you’ll…
- Own end-to-end delivery planning, staffing, execution, quality management, and reporting for multiple concurrent customer engagements.
- Coordinate cross-functional project pods consisting of experts, annotators, reviewers, red teamers, QA personnel, researchers, engineers, and data scientists.
- Build scalable systems that improve project visibility, quality, throughput, or operational efficiency.
- Partner with executives and functional leaders to forecast capacity and support business growth.
- Contribute to customer expansion opportunities through exceptional delivery and stakeholder management.
- Help define how program management scales at mpathic as the platform and team grow.
You’ll Thrive in This Role If You…
- Translate ambiguity into execution: you can take a complex, evolving customer program and turn it into a clear project plan, a staffing model, and a risk-managed delivery path.
- Bring operational rigor and strong judgment, and can proactively identify risks before they impact delivery commitments.
- Have designed or managed annotation workflows, QA processes, or expert-driven data programs, and understand what it takes to make human-data operations trustworthy at scale.
- Understand modern AI evaluation, red teaming, and human-in-the-loop concepts, and how to operationalize them into repeatable, measurable workflows.
- Are comfortable with high-stakes, sensitive subject matter such as AI safety, red teaming, and trust & safety, and bring care for both customers and the teams doing the work.
- Communicate crisply across project plans, status updates, and escalation paths, and can align distributed teams around ownership and timelines without relying on authority.
- Are energized by being the connective tissue across Human Data, QA, Engineering, Product, Research, and Customer Success teams.
- Have delivered large-scale programs in complex operational environments, shown through what you have delivered rather than years on a résumé. Experience in AI safety, LLM evaluation, or trust & safety operations is a strong plus.
What You’ll Do
Own Project Execution, Contract Handoff to Final Delivery (Core)
- Be the single operational lead for the delivery pods you own, carrying each program from intake through staffing, execution, quality review, and customer sign-off.
- Build and coordinate project teams.
- Establish project plans, staffing models, timelines, communication cadences, and risk mitigation plans.
- Monitor project performance, utilization, throughput, quality metrics, and budget performance.
Manage Customer Relationships & Delivery Milestones
- Manage customer expectations and delivery milestones throughout the engagement lifecycle.
- Lead internal project meetings, status reviews, and customer-facing reporting cadences.
- Identify expansion opportunities and surface operational improvements to leadership.
Build Quality & Operational Systems
- Define what “good” means for the programs you own, and the quality controls and QA processes that measure it.
- Set annotation and human-data quality standards that are consistent, auditable, and humane for the people doing the work.
- Build scalable operational systems that improve project visibility, quality, or throughput across the Human Data team.
Capture Inputs from Customers & Delivery Teams
- Partner with pre-sales and customer-facing teams to translate customer requirements into executable delivery plans.
- Deliberately bring front-line delivery insights into capacity planning and operational improvement.
- Turn what you learn into systemic improvements rather than one-off workarounds.
Be the Connective Tissue
- Keep Human Data, QA, Engineering, Product, Research, and Customer Success aligned on what is being delivered and why.
- Translate between operational, technical, and commercial perspectives so the right tradeoffs get made.
About the Team
You will work closely with:
- Human Data leadership and QA leads: to staff, coordinate, and deliver high-quality programs.
- Engineering, AI/ML, and Research: to resolve technical blockers and integrate tooling into workflows.
- Sales and Customer Success: to align what we deliver with customer needs, expectations, and expansion opportunities.
We value operational excellence, systems thinking, and high standards for quality, clarity, and auditability.
Required Experience
- 5+ years of program management, project delivery, operations leadership, or related experience.
- Demonstrated experience managing large-scale human data, annotation, evaluation, trust & safety, research operations, or expert-driven workflows.
- Experience leading distributed teams of contractors, reviewers, annotators, experts, or QA personnel.
- Experience managing customer-facing programs with executive stakeholders.
- Experience operating in startup or high-growth environments.
Preferred Qualifications
- PMP, PgMP, CAPM, Agile, Scrum, Lean, or equivalent project management certification strongly preferred.
- Experience with AI safety, LLM evaluation, red teaming, human-in-the-loop systems, or trust & safety operations.
- Experience building scalable operational systems and quality management processes.
Apply Even If You Don’t Check Every Box
If you’re excited about bringing operational rigor, strong judgment, and quality systems into AI safety evaluation work, and want to help ensure emotionally grounded AI systems are safe and trustworthy, we’d love to hear from you.