Senior Site Reliability Engineer

Posted 20 hours ago
5-10 years experience


AI Summary

Define and maintain reliability metrics while leading incident response and automating operational tasks. Design and operate containerized workloads and observability systems to ensure high availability and scalability.
  • Define and maintain SLIs/SLOs, monitor alignment and error budget usage
  • Lead incident response and postmortems, implement corrective measures
  • Automate operations tasks via tooling (e.g. auto-remediation, scaling rules)
  • Build, improve, and maintain CI/CD pipelines, canary deployments, blue/green strategies
  • Lead technical discussions with customers to align on reliability, scalability, and performance requirements
  • Drive continuous platform improvements across the service lifecycle, including architecture, monitoring, and operational processes
  • Implement and extend observability systems (metrics, tracing, log aggregation)
  • Optimize performance and cost by tuning cloud services, autoscaling, resource rightsizing
  • Design, deploy, and operate containerized workloads using Docker and Kubernetes in production environments
  • Collaborate with dev teams to integrate resilience patterns (circuit breakers, bulkheading)
  • Participate in architecture discussions around high availability, disaster recovery
  • Mentor mid-level and junior SREs; conduct reliability design reviews
  • 5–8 years of experience in a reliability or operations role
  • Cloud-agnostic certification: Terraform Associate, Certified Kubernetes Administrator (CKA), or SRE Foundation
  • Cloud provider certification: Professional-level certification in AWS (Solutions Architect), Azure (Solutions Architect Expert), GCP (Professional Cloud Architect), or Oracle Cloud (Architect Professional)
  • Solid coding skills (Python, Go, or equivalent)
  • Experience with IaC, CI/CD pipelines, and monitoring/observability stacks (Prometheus, Grafana, OpenTelemetry, ELK, Jaeger)
  • Experience working with distributed systems and production-scale services
     

Nice-to-have Skills

  • Exposure to multi-cloud data replication or cross-cloud networks
  • Experience with chaos engineering or fault injection

Datavail’s Team of Oracle Experts Can Save You Time and Money

As an Oracle Platinum Partner with 17 specializations, we have extensive experience with everything Oracle. Our experts have an average of 16 years of experience. They’ve overcome every obstacle in helping clients manage everything from databases, BI analytics, reporting, migrations, and upgrades to monitoring and overall data management.

You can free up your IT resources to focus on growing your business rather than fighting fires. Our Oracle experts can guide you through strategic initiatives or support routine database management.


Datavail’s Comprehensive Oracle Database Services

Datavail offers Oracle consulting services that allow you to take advantage of all the features of the Oracle database. We can also assist you in designing, implementing, and managing a wide range of Oracle applications.

Oracle Database Managed Services

Datavail’s business focuses on helping you use your data to drive business results through cost-saving services. The success of your business depends on how well you understand and manage your data. Our Oracle managed cloud services give you the power to unleash your organization’s potential. We provide comprehensive and technically advanced support for Oracle installations to ensure that your databases are safe, secure, and managed with the utmost level of care.

Our delivery performance in data management leads the industry. We offer highly trained Oracle database administrators via a 24×7, always-on, always-available global delivery model. Datavail’s flexible, client-focused services always add value to your organization. Our Oracle database managed services and products include:
