Arctiq

Senior Site Reliability Engineer

Posted a month ago

United States

5-10 years experience

AI Summary

Architect the reliability strategy for large-scale government systems by implementing SRE frameworks and SLO-based management. Lead strategic automation, incident command for major outages, and the integration of security-as-code within DevSecOps pipelines.

Arctiq is a global, intelligence-driven technology services company delivering professional and managed services across Hybrid Cloud Infrastructure, Networking & Connected Experiences, Cybersecurity, Data & AI, Autonomous Operations & Intelligence, and Enterprise Service Management. We help organizations operate, secure, and modernize complex environments by unifying infrastructure, networking, data, security, automation, and observability under a single, integrated operating model. Our work focuses on helping customers reduce operational friction, improve resilience, and make better, faster decisions as their environments evolve. Arctiq builds on decades of industry expertise and a customer-centric ethos to deliver exceptional value to clients across diverse industries.

The Senior Site Reliability Engineer is a technical leader responsible for architecting the reliability strategy for large-scale, distributed government systems. You will lead the implementation of the SRE framework, driving the adoption of SLO-based management and advanced automation. As a subject matter expert, you will mentor mid-level engineers and interface with government stakeholders to ensure system resilience and performance meet mission requirements.

Key Responsibilities

Reliability Architecture: Define the strategy for Service Level Objectives (SLOs) and Error Budgets. Design complex telemetry pipelines for full-stack observability.
Strategic Automation: Design and govern the enterprise Infrastructure as Code (IaC) standards. Develop custom tooling to automate complex recovery procedures and system scaling.
Incident Command: Act as the Incident Commander for major system outages, leading the technical response and directing the Root Cause Analysis (RCA) process.
Security & Compliance: Lead the integration of security-as-code within DevSecOps pipelines, ensuring full compliance with RMF and NIST 800-53 standards.
Mentorship: Provide technical guidance and mentorship to Mid-Level SREs and developers, fostering a culture of reliability across the organization.

Required Qualifications

7+ years of experience in SRE or DevOps, with significant experience in distributed systems.
Expertise in Go, Python, or Java and advanced knowledge of Linux internals.
Extensive experience managing production Kubernetes environments and complex cloud architectures.
Proven track record of defining and meeting SLOs for high-availability systems.
Experience navigating government Risk Management Framework (RMF) processes.
Education: Bachelor’s or Master’s degree in Computer Science or Engineering.
Certifications: CKA (Certified Kubernetes Administrator) and an industry observability certification are preferred.
