1937 Sr DevOps Engineer - Production Support

Posted a month ago
Brazil
5-10 years experience

AI Summary

Monitor critical production systems and act as the primary technical responder for live incidents and escalations. Focus on improving platform reliability through automation, runbook refinement, and collaboration with engineering teams.

Senior DevOps Engineer - Production Support (Azure/AKS)

  • Location: Remote from LATAM (100% Remote)
  • Contract Type: Full-time vendor (Contracted directly by Inallmedia.com)
  • Time Zone Alignment: Central Time (CT) ±2 hours

About Inallmedia.com

Inallmedia.com is a global technology and design firm focused on building impactful digital solutions through remote, distributed teams across LATAM. We partner with international clients across industries, providing long-term technical expertise, product innovation, and team augmentation.

For this specific role, you will be contracted directly by Inallmedia.com to support a leading, high-growth sustainable energy and clean technology enterprise based in the USA.

Project Overview

You will join a dynamic engineering squad dedicated to maintaining and optimizing the cloud infrastructure that powers critical clean energy and solar storage solutions across North America. As a Senior DevOps Engineer, you will focus heavily on production reliability, monitoring high-availability cloud systems, and driving incident response.

This role bridges the gap between infrastructure operations and engineering, ensuring the scalability and resilience of next-generation green tech platforms.

Key Responsibilities

  • System Monitoring: Monitor critical production systems—including Azure Kubernetes Service (AKS), microservices, and CI/CD pipelines—using advanced dashboards and proactive alerting.
  • Incident Response: Act as the primary technical responder for live production incidents and Slack escalations, ensuring rapid triage, root-cause identification, and swift resolution.
  • Operational Excellence: Maintain, refine, and improve internal runbooks and standard operating procedures (SOPs) to ensure operational predictability.
  • Deployment Support: Oversee and support deployment activities across both production and non-production environments while strictly adhering to SLAs and corporate response times.
  • Reliability Engineering: Collaborate deeply with core DevOps and software engineering teams to root out recurring systemic issues and elevate overall platform reliability.
  • Automation: Help design and implement automation scripts for recurring operational tasks to reduce manual toil.

Must-Have Skills

  • Cloud & Support Experience: 6+ years of proven experience in DevOps, Cloud Infrastructure, or high-stakes Production Support roles.
  • Azure Mastery: A solid, comprehensive understanding of Microsoft Azure fundamentals (specifically Compute, Networking, and Azure Monitoring ecosystems).
  • Kubernetes Expertise: Hands-on operational experience with Kubernetes, specifically Azure Kubernetes Service (AKS) operations, log analysis, and cluster scaling.
  • Observability Tooling: Strong familiarity with modern monitoring and observability tools (such as Azure Monitor, Grafana, Prometheus, or similar).
  • Incident Management: Well-versed in structured incident management, escalation workflows, and working under strict SLA guidelines.
  • Scripting Fluency: Intermediate-to-advanced scripting capabilities using Bash, PowerShell, or Python for task automation.
  • Remote & Agile Mindset: Extensive experience working autonomously in Agile teams within 100% remote environments.
  • Communication: Exceptional verbal and written English skills for seamless daily technical collaboration.

Nice-to-Have Skills

  • Hands-on exposure to building and optimizing CI/CD pipelines (GitHub Actions, Jenkins, etc.).
  • Practical exposure to Infrastructure as Code (IaC) concepts and tools (Terraform, Bicep).
  • Prior experience operating in 24/7 mission-critical or high-availability (HA) infrastructure environments.
  • Familiarity with ITIL frameworks or highly structured enterprise incident management ecosystems.

Time Zone & Collaboration

The role requires close collaboration with teams aligned to Central Time (CT). Full integration with the US-based team during core operational hours is expected, allowing for real-time collaboration and agile synchronization.

Language

All interviews, technical documentation, and daily communication will be conducted exclusively in English.
