Site Reliability Engineer

Posted 17 hours ago

$75,700 - $136,300 per year

2-5 years experience

AI Summary

Design, develop, and operate critical infrastructure services and observability solutions to ensure the reliability and scalability of Akamai Cloud. Drive reliability improvements through automation and collaborate with engineering teams to resolve complex production issues.

Do you enjoy collaborating with teams to solve complex challenges?

Do you have a passion for automation and building systems that scale?

Join our highly skilled Site Reliability Engineering team!

Our team designs, develops, and manages applications and infrastructure that support Akamai Cloud's products and services. Our SRE teams solve reliability, security, and usability challenges at scale for our global fleet while keeping Akamai's mission at the forefront of what we do: make life better for billions of people, billions of times a day.

Partner with the best

In this role, you will focus on configuration management, Infrastructure as Code (IaC), and CI/CD. You will design, develop, and operate infrastructure deployment for Akamai Cloud.

As a Site Reliability Engineer, you will be responsible for:

  • Designing, developing, testing, and operating critical services that support the reliability, scalability, and performance of our infrastructure.
  • Designing and implementing observability solutions, including monitoring, logging, alerting, and telemetry capabilities, to proactively detect and resolve issues.
  • Driving reliability improvements through automation, reducing operational toil and increasing the resilience of engineering processes.
  • Developing technical expertise in IaC systems and serving as a trusted technical resource, mentoring engineers and sharing best practices.
  • Collaborating with software engineering, infrastructure, and platform teams to investigate complex production issues, identify root causes, and implement long-term corrective actions.
  • Participating in an on-call rotation and providing leadership during incident response, driving timely service restoration, effective communication, and post-incident improvement efforts.

Do what you love

To be successful in this role you will:

  • Have relevant experience and a Bachelor's degree in Computer Engineering, Computer Science, or equivalent.
  • Demonstrate experience in a Site Reliability or Software Engineering role, working with large-scale distributed systems.
  • Have experience with Terraform, including module development, state management, workspace design, policy enforcement, and enterprise-scale Infrastructure as Code implementations.
  • Have experience managing Infrastructure as Code solutions using tools such as Terraform, SaltStack, Ansible, Chef, Puppet, or similar technologies.
  • Have experience designing, developing, and deploying software and infrastructure at scale in a Linux environment.
  • Have great communication and interpersonal skills.

About us

At Akamai, we make life better for billions of people, trillions of times a day.
Whether you're streaming live events, scrolling social media, watching your favorite series, or managing your savings, we're the engine behind the scenes. We provide the world's most distributed platform from Cloud to Edge to help the giants of the digital world work faster and stay more secure, making the internet a better experience for everyone.

Our focus is simple:
  • Cloud and Edge: Running apps closer to users for instant performance.
  • Security: Neutralizing threats before they ever reach your data.
  • Content Delivery: Scaling the world's biggest moments without a glitch.
  • AI: Enabling our customers to build, secure, and scale AI apps on the world's most distributed cloud platform.

At Akamai, we don't just support the internet; we power and protect it, because behind every great digital experience is a massive hidden challenge. And we're the ones who solve it. When millions of people hit play or pay, Akamai ensures it just works.

Benefits at Akamai: We support your health, well-being, finances, and life beyond work. See our benefits.

FlexBase adapts to your job's needs

Akamai's FlexBase program is yet another way we show our commitment to providing employees with an exceptional workplace experience. It's not about telling employees where to work; it's about supporting employees to do their best work.

We trust our incredible employees to work in ways that suit them best: at home, in an office, or a combination of both.

Connect with us on social and see what life at Akamai is like!

Compensation

Akamai is committed to fair and equitable compensation practices. For US-based candidates only, the base salary for this position ranges from $75,700 to $136,300 per year; a candidate’s salary is determined by various factors including, but not limited to, relevant work experience, skills, certifications, and location. Compensation for candidates outside the US will vary. The compensation package may also include incentive compensation opportunities in the form of an annual bonus or incentives, equity awards, and an Employee Stock Purchase Plan (ESPP). Akamai provides industry-leading benefits including healthcare, a 401(k) savings plan, company holidays, vacation (in the form of PTO), sick time, family-friendly benefits including parental leave, and an employee assistance program with a focus on mental and financial wellness. Eligibility requirements apply.
