Strattmont

Computer Vision & AI Lead

Posted a month ago

Saudi Arabia, United Arab Emirates, United Kingdom, United States

⭐ 5-10 years experience

AI Summary

Lead the SiteGuard computer vision product end-to-end, managing the development of detection models, edge and cloud agents, and MLOps pipelines. Build and mentor a high-ownership engineering team while collaborating cross-functionally to ensure safety compliance on construction sites.

About the company:

The Company is a construction-tech company headquartered in Riyadh, building the connected-worker platform for large-scale construction and industrial sites. Our hardware (smart helmets, anchors, and gateways) feeds a real-time SaaS platform that gives site teams worker location, safety compliance, automated mustering, and productivity analytics. We operate across major projects in the GCC and are growing fast. The engineering team is small, senior, and high-ownership. We are building for the long term.

About the Role

SiteGuard is a computer-vision safety product - detecting PPE non-compliance, hazards, and unsafe behavior from site cameras in real time. The stack runs both on-site (edge agents close to the camera) and in the cloud, with consistent detection output either way. As SiteGuard Squad Lead, you own all of it: detection models, camera agents, evaluation and MLOps pipelines, and the product surfaces that turn raw detections into safety actions for HSE managers and site supervisors.

The world of CV and AI has moved fast. Foundation models, vision-language models (VLMs), and generative AI have changed what is achievable and how quickly. You have kept pace: you know when to use a classical detector, when a VLM gives you better results with less annotation cost, and when synthetic data generation can close a gap that real-world collection cannot. You bring both the classical CV foundations and the modern AI stack.

This is a senior engineering leadership role. You have led teams before: you have hired engineers, managed performance, and built culture. You are also a serious daily user of AI coding tools. You lead a cross-disciplinary team of senior engineers with high autonomy and direct product impact.

What Success Looks Like - First 6 Months

You own the SiteGuard stack end-to-end and can lead a technical walkthrough of any layer: models, edge agent, cloud agent, evaluation pipeline, product surface.
The golden labeled dataset is established as the single source of truth, and regression gates are in place so no model, prompt, or pipeline change ships without a measured quality bar.
Both the on-site edge agent and the cloud camera agent produce consistent detection output. Latency, cost, and accuracy are tracked in production. You know when something degrades before a customer does.
The team has a clear position on the modern AI stack: which detection approaches fit which deployment contexts, and where foundation models or VLMs replace or augment classical pipelines.
Stakeholders in Hardware, Field Engineering, and Product trust you as the person accountable for SiteGuard - its accuracy, its reliability, and its direction.

Key Responsibilities

1. Build & Lead the Team

Own hiring end-to-end for SiteGuard: define the bar across ML, backend, and frontend disciplines, run the process, and make decisions. You have hired senior engineers before and know what good looks like.
Manage performance directly - set clear expectations, give continuous feedback, run meaningful reviews, and act decisively when someone is not meeting the bar.
Build a high-ownership engineering culture where engineers take initiative, write their own issues, and feel accountable for product outcomes, not just task completion.
Mentor engineers at every level - from onboarding new contributors to developing senior engineers. Your track record includes engineers who grew significantly under your leadership.
Own the squad's delivery: run agile ceremonies, plan sprints, and partner with Product to translate the SiteGuard roadmap into shippable increments.

2. Computer Vision & AI Model Development

SiteGuard's detection capabilities must stay current with the AI model landscape, not lag behind it. You make the architectural calls on which approaches fit which contexts.

Own the full lifecycle of SiteGuard's detection capabilities - PPE compliance, hazard detection, unsafe-behavior recognition - from problem framing through training, evaluation, and production deployment.
Lead the strategic use of modern architectures and foundation models: transformer-based detectors (RT-DETR, Co-DETR) alongside YOLO-family models (including YOLOv10/11); zero/few-shot approaches via CLIP, DINOv2, GroundingDINO, and SAM 2; and VLMs (Gemini Vision, GPT-4V, LLaVA, Qwen-VL) for scene understanding and incident reasoning where they outperform purpose-built detectors.
Lead the use of generative AI to augment training data - diffusion-based synthetic image generation (ControlNet, Stable Diffusion) for rare PPE violations, lighting conditions, and site environments that real-world collection cannot cover economically.
Define and drive model-quality targets (precision/recall, false-alarm rates) and the retraining loops that sustain them as site conditions change.
Oversee dataset strategy: collection, AI-assisted annotation (using SAM, CLIP, or VLMs as labeling tools), curation, and governance of site imagery and video.

3. Edge & Cloud Camera Agents

Lead development of the on-site edge agent that runs inference close to the camera - optimizing transformer-based and classical models for constrained hardware (quantization, INT8/FP16, TensorRT, ONNX, batching, CUDA/NPU accelerators).
Lead development of cloud camera agents for cloud-based deployments, ensuring the same consistent detection output as on-site agents regardless of deployment type.
Engineer for the realities of the field: on-prem gateway deployment, intermittent connectivity, store-and-forward, and graceful degradation.
Ensure low-latency, reliable detection-to-alert pipelines from camera to platform, including on-device pre-filtering before cloud VLM calls where cost and latency demand it.

4. Evaluation, MLOps & Data Pipelines

Own the golden labeled dataset as the single source of truth for evaluation, fine-tuning, and production monitoring. Feed production reviewer signals (labels, corrections) back into the dataset continuously.
Design evaluation metrics for both classical detector output and VLM-generated detections (precision/recall, false-positive control, human-agreement). Run every model, prompt, or schema change as a regression test against the golden set before release; no change ships without a measured quality bar.
Architect data-extraction, AI-assisted annotation, and training pipelines that ensure reproducibility and versioning of datasets and models (experiment tracking, model registries, dataset versioning).
Implement CI for models and code: automated retraining/evaluation, LLM-as-judge patterns for open-ended detection outputs, and production monitoring for model drift, cost, and accuracy degradation.
Build deterministic post-processing guardrails over model output - domain/OSHA rule filters, confidence calibration, audit trails - so the product behaves predictably even when model output varies.
Raise engineering maturity across SiteGuard repositories: test coverage, CI gating, and coverage reporting.

5. Product, Quality & Privacy

Own the SiteGuard product surfaces (web dashboards and frontends) and the APIs that deliver detected violations and safety events into the Company's core platform.
Ensure alerting, reporting, and analytics turn raw detections into clear, prioritized actions - including LLM-generated incident summaries and natural-language search over safety event history.
Enforce code review, testing, and quality standards across model and application code.
Champion privacy-by-design for video and personal data - anonymization, access controls, retention limits, and responsible use of footage in compliance with Company and customer requirements. Implement responsible-AI safeguards on all AI-generated outputs: confidence thresholds, human-in-the-loop review for high-severity alerts, and audit trails.

6. Cross-functional Collaboration & Innovation

Collaborate with Hardware/camera, DevOps, Field Engineering, and customer teams to validate SiteGuard in real site conditions and incorporate feedback.
Lead R&D into new detection capabilities - multi-modal models combining video and sensor data, behavioral analysis, crowd analytics - and evaluate emerging CV/AI approaches with honest, grounded judgment.
Stay current with the fast-moving foundation-model and VLM ecosystem and translate new research into concrete roadmap decisions for the squad.

Required Qualifications

Experience

6+ years of software/ML engineering experience with a clear progression from senior IC to engineering leadership.
3+ years directly managing engineers - leading teams, owning hiring, running performance cycles. Mentoring is not the same as managing; this role requires the latter.
A track record of hiring: you have built or significantly grown an engineering team and made independent hiring decisions at the senior engineer level and above.
Proven experience shipping production systems built on vision/multimodal foundation models (VLMs/LLMs via cloud APIs) - owning quality, latency, and cost from prototype to scale.
Hands-on experience operating high-throughput, asynchronous video/media processing pipelines in production.

Technical Depth & Stack Flexibility

We care about depth of engineering thinking, not tool loyalty. The examples below indicate the class of tool - not necessarily the exact one you have used.

Strong Python (async-first: asyncio, FastAPI or equivalent); production service design with a focus on reliability and observability.
Hands-on with multimodal/VLM APIs - Gemini/Vertex AI, OpenAI, or Anthropic equivalents: prompt engineering, structured/JSON-schema-constrained output, context/caching, and per-model parameter tuning.
Computer-vision foundations: object detection and segmentation across both classical architectures (YOLO-family) and modern transformer-based detectors (RT-DETR, Co-DETR, GroundingDINO); video frame extraction and handling (OpenCV/FFmpeg); spatial reasoning over model output.
Foundation models and zero/few-shot approaches: CLIP, DINOv2, SAM 2 for annotation assistance and detection; VLMs for scene understanding and incident reasoning.
Edge inference optimization: ONNX, TensorRT, quantization (INT8/FP16), deployment to constrained hardware (Jetson, Hailo, or equivalent).
Distributed pipeline design: message brokers, relational databases with async ORM and migration tooling, object storage - comfortable across cloud providers (GCP, AWS, or Azure).
MLOps stack: experiment tracking, model registries, dataset versioning, and CI pipelines for model evaluation.

AI Tooling - Daily Practice

Primary daily experience with Claude Code and Codex - used in real engineering work, with formed opinions about when to trust their output and when not to.
Current awareness of the AI model landscape: practical differences between frontier models (Gemini, GPT-4o, Claude, DeepSeek) for vision tasks, code generation, and structured output.
Tracks AI trends actively - model releases, VLM capabilities, agentic framework developments - and translates this into concrete, grounded team guidance.

Communication

Exceptional written and spoken English. This is a hard requirement. You write clearly and precisely - design documents, evaluation reports, and stakeholder updates are well-structured and unambiguous. You can explain a model's failure mode to an HSE manager and an architecture decision to an engineer with equal clarity.

Preferred Qualifications

Experience with safety, surveillance, or video analytics in industrial or construction environments. OSHA/EHS domain knowledge is a strong plus.
Experience with synthetic data generation pipelines (ControlNet, Stable Diffusion) for computer-vision training data augmentation.
LLMOps / observability for model-backed services: tracing model calls, output monitoring, A/B testing of prompts and schemas.
Agentic frameworks (LangChain, LlamaIndex, AutoGen, or equivalent) applied to safety workflows or multi-step incident management.
Comfort working config-over-code for multi-tenant rollouts (per-project model and prompt configuration).
Familiarity with on-prem/edge deployment, gateways, and operating under intermittent connectivity.

Leadership & Soft Skills

Experienced people manager: you have had difficult performance conversations, managed out engineers who were not meeting the bar, and done so with fairness and directness.
Research-oriented curiosity balanced with production pragmatism - you read papers, run experiments, and ship to production. You distinguish AI approaches that deliver real site-safety value from impressive benchmarks that do not survive the field.
Excellent written and spoken English - able to translate model behavior, failure modes, and confidence levels to non-technical stakeholders, including explaining why an AI made a specific safety call.
Strategic, outcome-driven thinking: makes technology decisions based on product value and field reliability, not novelty.
Comfort operating in a fast-paced, evolving environment where both site priorities and the AI landscape shift quickly.

What We Offer

Competitive salary, performance bonus, and equity participation.
High-autonomy role with direct product and company impact - you are building a safety AI product from the ground up, not maintaining an inherited codebase.
A small, senior engineering team where your decisions matter and your name is on the architecture.
Relocation support for candidates joining from outside KSA.
Health insurance, annual flights, and the standard Company benefits package.
