Technical Operations Lead



AI Summary

Lead technical operations for a cloud-native AWS data and AI platform, focusing on reliability, observability, and incident response. Coordinate with SRE leads to automate workflows and ensure production readiness for federal data products.

Why Karsun?

Join Karsun Solutions to grow your career with the company transforming possible for the US Government.

 

At Karsun, collaboration drives our community. We’re committed to building an environment where team members from diverse backgrounds can innovate, learn and grow with us. Here at Karsun, the only limit to your potential is the limit of your curiosity.

 

Join Team Karsun, and Find Your Next!

Summary 

This individual will lead technical operations for a cloud-native (AWS) data and AI platform supporting a federal program, and will own reliability, observability, incident response, platform engineering, and data-product operationalization.

What You'll Be Doing:

  • Serve as primary technical owner for platform availability, reliability, and operational runbook development for data pipelines, feature stores, model serving, and supporting infrastructure.
  • Work closely with the SRE Lead to design and operationalize SRE practices (SLIs/SLOs/SLAs, error budgets, toil reduction) to transition teams from DevOps to SRE.
  • In collaboration with the SRE Lead, build and maintain monitoring, alerting, and observability across data and AI stacks (ETL/ELT, data lakes/warehouses, model training & serving), including metrics, distributed tracing, and centralized logging.
  • Lead incident management: on-call rotations, incident response, RCA, remediation tracking, and continuous improvement.
  • In collaboration with the SRE Lead, automate operational workflows (deployments, scaling, recovery) using IaC (Terraform/CloudFormation) and CI/CD pipelines; reduce manual operational toil.
  • Define and enforce runbooks, backup/restore, RTO/RPO, and disaster recovery for data and ML systems.
  • Partner with data product owners, ML engineers, security, and compliance to ensure production readiness, access controls, and federal compliance requirements.
  • Manage capacity planning, cost optimization, and performance tuning of AWS resources for data and ML workloads.
  • Mentor and lead an ops/SRE team; set technical priorities and coordinate cross-functional platform changes.
  • Maintain vendor and third-party integrations and coordinate upgrades/patching under federal change-control processes.
  • Track and report reliability metrics and operational maturity improvements to stakeholders.

 

Required Qualifications:

  • 10+ years of directly relevant IT work experience.
  • 7+ years technical operations / platform / SRE experience supporting data-intensive systems; 3+ years in AWS production environments.
  • Deep understanding of data products and product ownership: data lineage, stewardship, SLAs, and consumer contracts.
  • Proven experience operating data platforms: Databricks, Airflow, S3, Kafka/Kinesis.
  • Strong SRE practice knowledge: SLI/SLO design, incident response, runbooks, chaos/failure-mode testing.
  • Hands-on with observability tooling (Prometheus, Datadog, OpenTelemetry) and log/tracing systems.
  • Familiar with IaC (Terraform or CloudFormation), CI/CD (GitHub Actions/Jenkins/ArgoCD), container orchestration (EKS/Kubernetes), and scripting (Python, Bash).
  • Solid security and compliance experience for federal environments (RBAC, encryption, secrets management).
  • Excellent written and verbal communication; ability to produce clear runbooks and RCA reports and to brief leadership.

Preferred Qualifications:

  • AWS Certified Solutions Architect – Associate (desirable).
  • Prior experience with ML lifecycle/MLOps tooling (SageMaker, Databricks) and feature stores.
  • Experience migrating teams from DevOps to SRE and driving organizational change.
  • Experience with cost optimization and governance of large AWS data/ML workloads.
  • Familiarity with federal program processes, change control, and procurement cycles.
  • Active federal clearance or ability to obtain one.

Things to Know:

Commitment to Non-Discrimination

All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran or any other status protected by applicable federal, state, local, or international law.

 

Salary Range

The proposed salary range for this role is $160,000 to $175,000 USD. The salary range provided is a good faith estimate representative of all experience levels. Karsun considers several factors when extending an offer, including, but not limited to, the role, function, and associated responsibilities, as well as a candidate’s work experience, location, education/training, and key skills.

 

Third Party Resumes: Karsun does not accept unsolicited resumes through or from search firms or staffing agencies. All unsolicited resumes will be considered the property of Karsun and Karsun will not be obligated to pay a placement fee.

 

Clearance Information

This position requires the eligibility to obtain a security clearance. The Defense Industrial Security Clearance Office (DISCO), an agency of the Department of Defense, handles and adjudicates the security clearance process. More information about Security Clearances can be found on the US Department of State government website: https://www.state.gov/m/ds/clearances/c10978.htm

 

Location

To be considered for this role, you must reside in one of the following states: CA, CO, DC, FL, GA, IL, MD, NJ, NY, NC, OH, OK, PA, SC, TX, VA, WV.

 

Applicants must be authorized to work in the U.S. We may consider candidates currently in H-1B status who are eligible for transfer.
