About Centific
Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem—comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets—to create contextual, multilingual, pre-trained datasets; fine-tuned, industry-specific LLMs; and RAG pipelines supported by vector databases. Our zero-distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.
Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.
About the Job
We are seeking a highly motivated PhD intern to join Centific’s Vision AI team for a 3–6 month engagement. This is an applied research role for doctoral candidates who want to move beyond the lab and deploy their expertise directly into a live AI operations program. You will be embedded in a production computer vision system processing real-time video feeds across multiple active detection workflows. You will work alongside senior engineers and ML leads to implement, optimize, and measure AI improvement strategies that ship to production on a daily pipeline cadence. The emphasis is on building and shipping: translating model research into working, measurable systems that improve real-world detection performance.
Across all focus areas, you will monitor daily pipeline KPIs, contribute to post-run analysis, and document implementation decisions in the team ops ledger. In addition, you will be assigned to two of the following five focus tracks for the duration of the engagement:
- Track 1 — NVIDIA VSS / DeepStream Optimization: Optimize real-time RTSP feed processing and multi-stream batching; configure and tune object tracking to eliminate re-detection false positives and reduce hallucination rates across active surveillance workflows.
- Track 2 — Teacher → Student Distillation: Implement and run distillation cycles that compress 20+ epoch full retrains into 3-epoch student passes; maintain and improve three student model variants with daily pipeline integration and performance validation.
- Track 3 — SEAL Drift Detection & Auto-Correction: Monitor metrics against a rolling baseline to detect distribution shift; execute targeted fine-tuning or short retraining cycles when drift thresholds are crossed, and systematically reduce recurring false positives.
- Track 4 — Self-Distillation & Confidence Calibration: Run self-distillation refinement passes in which student models act as their own teachers; apply consistency confidence calibration to narrow confidence intervals and reduce overconfidence-driven hallucinations.
- Track 5 — Student ↔ Student Weighted Peer Learning: Run confidence-weighted ensemble computations across three student model variants, monitor inter-student disagreement rates, route high-disagreement frames to the human review queue, and conduct weekly contribution audits to ensure balanced peer learning and prevent teacher-bias propagation.
Qualifications
- Currently enrolled in a PhD program in Computer Science, Electrical Engineering, Applied Mathematics, or a closely related field, with a strong orientation toward applied systems and implementation.
- Deep expertise in computer vision fundamentals: convolutional neural networks, transformers (ViT, DETR), and generative models.
- Strong proficiency in Python and deep learning frameworks including PyTorch and/or TensorFlow.
- Hands-on experience with large-scale dataset processing, annotation workflows, or benchmark construction.
- Solid understanding of model training techniques: transfer learning, self-supervised learning, and fine-tuning strategies.
- Strong implementation skills: ability to take a model research concept and produce a working, measurable system quickly; comfort operating in a daily-cadence production pipeline environment.
- Clear written and verbal communication skills; ability to document implementation decisions, pipeline changes, and performance results for both technical and operational audiences.
- Hands-on experience with NVIDIA DeepStream, TensorRT, or TAO Toolkit; familiarity with RTSP stream processing, multi-stream batching, or edge inference optimization.
- Familiarity with 3D vision, point cloud processing, or LiDAR-visual fusion (particularly in outdoor surveillance or autonomous systems contexts).
- Practical experience with knowledge distillation (Teacher → Student, self-distillation, or peer learning), confidence calibration techniques (temperature scaling, isotonic regression, ECE measurement), or active learning / distribution shift detection.
- Prior industry internship experience in AI/ML research or data-centric AI.
- Prior experience contributing to a production AI pipeline or daily model training cadence; comfort reading and interpreting confusion matrices, F1/Precision/Recall trends, and confidence interval dashboards as operational signals.
- Experience with MLOps tools (Weights & Biases, MLflow, DVC) and cloud platforms (AWS, GCP, or Azure).
What You Will Gain
- Hands-on ownership of a live, production AI system processing real-world surveillance data daily — with measurable KPI targets, real drift events, and deployment decisions that matter.
- Mentorship from senior ML engineers and AI leads with deep expertise in deployed Vision AI systems, model distillation, drift correction, and edge inference optimization.
- Direct contribution to measurable performance improvements — reductions in hallucination rate, narrowing of confidence intervals, and F1 score gains — on a live public safety AI program.
- Access to proprietary datasets, annotation infrastructure, and compute resources for research experiments.
- Attribution and credit in Centific’s IP Vault for implemented strategies and methodology contributions, with potential for technical blog posts, internal white papers, or co-authorship on applied research artifacts arising from the program.
- Consideration for full-time opportunities upon PhD completion based on performance.
Compensation: Competitive hourly stipend commensurate with PhD program year and experience
Location: Remote-first; hybrid options available at select office locations
Start Date: Flexible — rolling admissions, positions filled as qualified candidates are identified
Duration: 3–6 months, with possibility of extension
Equipment: Laptop and cloud compute credits provided
Pay Rate: $50 per hour
Centific is an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.