Senior Site Reliability Engineer

 Posted 2 hours ago
     
10+ years experience


AI Summary

Own and evolve the reliability, security, and observability of the cloud platform using an AI-native mindset. Automate infrastructure tasks, manage production incidents, and ensure operational excellence across the AWS environment.
Company Overview:
Global Technology Services is a rapidly expanding organization based in Medellín, Colombia. We have one of the most influential networks in software development and IT services for the entertainment, financial, and logistics sectors. Our projected growth offers professionals many opportunities to advance their careers. Joining our team means working with large engineering teams across Latin America, the Philippines, and the United States, and contributing to cutting-edge developments in multiple industries.

 

Position Title: Senior Site Reliability Engineer 
Location: Remote-US

 

What you will be doing:
We are seeking a highly experienced Senior Site Reliability Engineer to own and evolve the reliability, security, observability, and operational maturity of our cloud platform. This is not a traditional SRE role. We are looking for an engineer who operates with an AI-native mindset and uses AI as a core operational force multiplier across infrastructure, incident response, automation, compliance, and operational excellence.
Required Skills & Experience
To excel in this role, you should possess:
AI-Native SRE Operations (Hard Requirement)
  • Expert use of AI tools and agentic workflows to automate infrastructure and SRE tasks.
  • Hands-on experience using AI for Terraform development, incident triage, log analysis, runbook creation, postmortems, operational automation, CI/CD pipeline generation, and reducing repetitive operational work.
  • Strong understanding of AI capabilities, limitations, and necessary validation processes.
  • Ability to clearly articulate AI workflows, tooling choices, operational safeguards, and production outcomes.
Cloud Infrastructure & AWS (Hard Requirement)
  • 10+ years managing production infrastructure for SaaS platforms, including 5+ years of senior AWS ownership.
  • Deep expertise with AWS services such as ECS, VPC, IAM, RDS, S3, CloudFront, Route53, Lambda, API Gateway, CloudWatch, Secrets Manager, and related security and governance services.
  • Advanced Terraform experience managing multi-account, multi-workspace environments, including infrastructure state, drift remediation, and dependency management.
  • Strong understanding of: provider versioning, state management, drift detection and remediation, dependency management, infrastructure blast radius analysis
  • Proven experience resolving production infrastructure drift safely
  • Significant experience leading production incidents as the accountable owner
  • Ability to operate calmly and effectively during high-severity outages
  • Proven experience authoring detailed postmortems and operational remediation plans
  • Strong understanding of operational risk management and production recovery procedures
Observability & Monitoring
  • Proven experience leading production incidents, driving root-cause analysis, and creating remediation plans.
  • Strong background in observability, monitoring, logging, distributed tracing, and alerting using tools such as Grafana.
  • Experience owning CI/CD pipelines, deployment strategies, infrastructure automation, and operational workflows.
Systems, Security & Compliance
  • Strong Linux administration, containerization (Docker), networking, and scripting skills.
  • Experience with security best practices, identity management (SAML, OIDC, SCIM), and compliance frameworks such as SOC 2, ISO 27001, HIPAA, or PCI.
  • Comfortable working directly with auditors and maintaining compliance controls.
Nice to Have:
  • Experience supporting Spring Boot or JVM-based systems in production
  • Experience with runtime security or EDR tooling such as Falco
  • Experience automating joiner/mover/leaver identity workflows using SCIM and IdP tooling
  • AWS certifications, including AWS Solutions Architect Professional, AWS DevOps Engineer Professional, and AWS Security Specialty
  • Ability to read and debug Kotlin or Java backend services from an SRE perspective
  • React/NodeJS/Backstage developer experience
  • MuleSoft API Management experience
Soft Skills:
  • Excellent verbal and written communication, able to convey ideas clearly.
  • Highly autonomous and proactive, taking ownership of tasks.
  • Adaptable to fast-paced, dynamic work environments.
  • Responsive and reliable across channels, including email and Slack, consistently delivering results.
  • Able to add immediate value to the client, contributing effectively from the first week.
Why you will love GTS:
  • Join a powerful tech workforce and help us change the world through technology
  • Professional development opportunities with international customers
  • Collaborative work environment
  • Career path and mentorship programs that will lead to new levels.
Join Lean Tech and contribute to shaping the data landscape within a dynamic and growing organization. Your skills will be honed, and your contributions will play a vital role in our continued success. Lean Tech is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
