Staff Site Reliability Engineer

 Posted a day ago
     
5-10 years experience

AI Summary

Improve and maintain the infrastructure powering the Cloudera Data Platform by defining patterns for IaC, CI/CD, and observability. The role involves optimizing systems to eliminate toil and providing operational support through incident response and on-call rotations.

Business Area:

Engineering

Seniority Level:

Mid-Senior level

Job Description: 

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

Cloudera is seeking a Staff Site Reliability Engineer (SRE) to improve and maintain the infrastructure that powers Cloudera Data Platform (CDP). You will help define organization-wide patterns for Infrastructure as Code (IaC), distributed systems, CI/CD, observability, security and more.

This role is not eligible for Immigration Sponsorship or Relocation.

As a Staff Site Reliability Engineer, you will:

  • Improve reliability and scalability of the platform

  • Build systems to manage platform infrastructure and applications (platform engineering)

  • Optimize existing systems and eliminate toil through simplification and automation

  • Provide operational support and engineering assistance to the whole of Cloudera engineering

  • Monitor availability, latency and overall service health

  • Practice sustainable incident response and blameless postmortems

  • Participate in an on-call rotation

We’re excited about you if you have:

  • BSc/MSc in a related field or equivalent experience

  • 5+ years of industry experience in an SRE, DevOps or related role

  • A collaborative working style and strong communication skills

  • Strong Linux and systems administration experience

  • Amazon Web Services (AWS) expertise, especially EKS, networking, security and scaling

  • Experience with container technology and microservices architectures, including Kubernetes

  • Experience with observability, logging and monitoring tools

  • Experience with Terraform and related technologies

You may also have:

  • Experience with CI/CD tools, such as Spinnaker, Jenkins, Flux CD, Argo CD

  • Experience with GitOps and Git-based automation

  • Programming experience in Python, Go or similar languages

  • Experience with compliance programs such as SOC, FedRAMP, HITRUST CSF

  • Experience with database systems, including Postgres and MySQL

  • Experience with Microsoft Azure or Google Cloud Platform

What you can expect from us:

  • Generous PTO Policy 

  • Support for work-life balance with Unplugged Days

  • Flexible WFH Policy 

  • Mental & Physical Wellness programs 

  • Phone and Internet Reimbursement program 

  • Access to Continued Career Development 

  • Comprehensive Benefits and Competitive Packages 

  • Paid Volunteer Time

  • Employee Resource Groups

EEO/VEVRAA
