Senior Site Reliability Engineer

 Posted a month ago
     
5-10 years experience

AI Summary

The Senior Site Reliability Engineer will architect and manage scalable cloud infrastructure while enforcing SRE best practices across the organization. They will also lead incident management, refine monitoring systems, and mentor junior team members to ensure high system availability.

Remote - United States

Job Description:
Stability AI’s Engineering Operations team is looking for a Senior Site Reliability Engineer (SRE) to join our growing team and play a pivotal role in shaping and improving our cloud infrastructure. This person will work closely with engineering, IT, security, and product teams to drive innovation and reliability in an evolving environment. Candidates should have the initiative to build and improve a maturing cloud landscape.

Responsibilities:

  • Developing and enforcing SRE best practices and standards across the organization.
  • Architecting and managing scalable systems in AWS and other cloud environments, focusing on high availability and resilience.
  • Implementing and maintaining infrastructure as code using Terraform.
  • Setting up and refining monitoring, logging, and alerting systems.
  • Driving incident management and root cause analysis to improve system reliability.
  • Championing SRE principles and mentoring junior team members.

Qualifications:

  • Experience collaborating with development teams to enhance CI/CD pipelines.
  • Experience scaling resource-intensive systems, be it storage, networking, or compute.
  • Knowledge and experience with Kubernetes or other container scaling solutions.
  • Background in software development or automation scripting.
  • Knowledge and experience with Grafana, ELK stack, or similar tools.
  • Cloud security experience.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

 
