Senior Site Reliability Engineer (Linux Performance) - Remote

 Posted 13 hours ago
  
 Poland
  
5-10 years experience

AI Summary

The role involves identifying performance bottlenecks and developing automation tools to optimize the Akamai Cloud platform. Responsibilities include designing observability infrastructure and collaborating with kernel and hardware teams to ensure system performance.

Join our highly skilled Site Reliability Engineering team

Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We do this while maintaining Akamai's mission at the forefront of what we do: make life better for billions of people, billions of times a day.

Partner with the best

As a Senior Site Reliability Engineer II in the Virtualization & Host Platforms (VHP) Linux Performance team, you will be at the forefront of Akamai Cloud platform software and hardware technologies. Our team is responsible for identifying performance bottlenecks and creating the supporting tooling to aid in these investigations. We work with all physical, virtual and containerized Linux platform software supporting the Akamai Cloud. We collaborate closely with the kernel and hardware teams to ensure newer kernels and operating systems are performant and qualified on new server builds.

You will have opportunities to write software automating our processes, help diagnose and resolve challenging production issues, advance initiatives to improve resource utilization and customer experience, and learn from a highly skilled engineering team. This position will require creative thinking combined with deep domain expertise in the areas of Linux systems engineering, performance tuning, hardware interfaces, systems administration, and configuration management.

As a Senior Site Reliability Engineer, you will be responsible for:

  • Developing, testing, and distributing changes to automation, software, services, and tools the VHP team is responsible for.
  • Designing and implementing enhancements to VHP observability infrastructure in order to identify and correct problems before they impact our customers.
  • Working comfortably with new tooling, code, and environments, and automating what’s possible.
  • Creating supporting tooling, using Ansible or profiling tools, to aid performance investigations.
  • Developing subject matter expertise in various components across our Compute environment.
  • Collaborating with our support, operations, and engineering teams to investigate and troubleshoot complex problems.
  • Participating in on-call rotations, guiding restoration and repair of service-impacting issues.

Do what you love

To be successful in this role you will:

  • Possess expert-level experience in Linux internals and system administration, and a deep understanding of the underlying hardware.
  • Possess advanced-level experience with the Linux kernel and OS, and with optimizing their configurations for KVM/QEMU virtualization.
  • Possess advanced-level experience designing, developing, and deploying software and infrastructure at scale.
  • Possess advanced-level experience in a Development or SRE role, preferably working with large-scale distributed systems.
  • Have experience with tools like SaltStack, Ansible, Chef, Puppet, or Kubernetes for managing infrastructure at scale.
  • Have great communication and collaboration skills.
  • Have relevant experience and a Bachelor's degree in Computer Engineering, Computer Science, or equivalent.

About us

At Akamai, we make life better for billions of people, trillions of times a day.
Whether you're streaming live events, scrolling social media, watching your favorite series, or managing your savings, we're the engine behind the scenes. We provide the world's most distributed platform from Cloud to Edge to help the giants of the digital world work faster and stay more secure, making the internet a better experience for everyone.

Our focus is simple:
Cloud and Edge: Running apps closer to users for instant performance.
Security: Neutralizing threats before they ever reach your data.
Content Delivery: Scaling the world's biggest moments without a glitch.
AI: Enabling our customers to build, secure, and scale AI apps on the world's most distributed cloud platform.

At Akamai, we don't just support the internet; we power and protect it, because behind every great digital experience is a massive hidden challenge. And we're the ones who solve it. When millions of people hit play or pay, Akamai ensures it just works.

Benefits at Akamai: We support your health, well-being, finances, and life beyond work. See our benefits.

FlexBase adapts to your job's needs

Akamai's FlexBase program is yet another way we show our commitment to providing employees with an exceptional workplace experience. It's not about telling employees where to work; it's about supporting employees to do their best work.

We trust our incredible employees to work in ways that suit them best: at home, in an office, or a combination of both.

 
