Senior Site Reliability Engineer

 Posted 20 hours ago
  
 India
  
⭐ 5-10 years experience

AI Summary

The role focuses on enhancing the availability, reliability, and performance of Akamai's mapping system through monitoring and analysis. Key duties include defining KPIs, developing proactive monitoring tools, and collaborating with product engineers to ensure scalable system design.

Do you like collaborating across teams to solve complex problems?

Do you enjoy solving large scale distributed systems problems?

Join the Mapping SRE team

The Mapping SRE team oversees and enhances availability, reliability, performance, and change management for Akamai's mapping system. This system routes trillions of daily client requests, managing tens of terabits per second of global content traffic. The team establishes KPIs, improves measurements, monitoring tools, and alerts, and resolves intricate production issues.

Partner with the best

A Site Reliability Engineer collaborates with teams to enhance Akamai's Mapping Service performance, availability, and reliability through monitoring and analysis. Key responsibilities include defining KPIs, improving alert systems, refining operational responses, and examining intricate performance concerns.

As a Senior Site Reliability Engineer, you will be responsible for:

  • Monitoring, investigating, and analyzing performance and availability while designing, managing, and tracking product-specific metrics and goals
  • Solving problems and avoiding their recurrence by developing tools and prototypes that proactively monitor service performance and availability
  • Working closely with product engineers to advocate for reliable and scalable system design that improves supportability and resilience
  • Leveraging skills in data analysis, network diagnostics, and debugging tools to characterize performance and recommend improvements
  • Collaborating with internal teams to help troubleshoot and resolve escalations and incidents for our customers

Do what you love

To be successful in this role you will:

  • Hold advanced expertise in Computer Science, Engineering, or a related field through formal education or equivalent professional experience.
  • Demonstrate extensive experience in Site Reliability Engineering or a related role, showcasing expertise and practical knowledge in the field.
  • Demonstrate exceptional ability to analyze complex data, translating findings into actionable insights that support strategic improvement initiatives.
  • Demonstrate expertise with scripting or procedural languages such as Python, Perl, Shell, C/C++, or Java.
  • Demonstrate expertise in SQL, analyzing data trends to aid decision-making, resolving data integrity issues, and optimizing systems.
  • Demonstrate expertise operating within a UNIX/Linux computing environment.
  • Utilize expertise in monitoring, alerting, and logging tools, including platforms like Grafana, to ensure system reliability.
  • Demonstrate deep, accurate knowledge of computer networking concepts, Unix/Linux internals, distributed systems, and system design.

About us

At Akamai, we make life better for billions of people, trillions of times a day.
Whether you're streaming live events, scrolling social media, watching your favorite series, or managing your savings, we're the engine behind the scenes. We provide the world's most distributed platform from Cloud to Edge to help the giants of the digital world work faster and stay more secure, making the internet a better experience for everyone.

Our focus is simple:
Cloud and Edge: Running apps closer to users for instant performance.
Security: Neutralizing threats before they ever reach your data.
Content Delivery: Scaling the world's biggest moments without a glitch.
AI: Enabling our customers to build, secure, and scale AI apps on the world's most distributed cloud platform.

At Akamai, we don't just support the internet; we power and protect it, because behind every great digital experience is a massive hidden challenge. And we're the ones who solve it. When millions of people hit play or pay, Akamai ensures it just works.

Benefits at Akamai: We support your health, well-being, finances, and life beyond work. See our benefits.

FlexBase adapts to your job's needs

Akamai's FlexBase program is yet another way we show our commitment to providing employees with an exceptional workplace experience. It's not about telling employees where to work; it's about supporting employees to do their best work.

We trust our incredible employees to work in ways that suit them best: at home, in an office, or a combination of both.

Connect with us on social and see what life at Akamai is like!
