Site Reliability Engineer

 Posted an hour ago
  
 India
  
2-5 years experience

AI Summary

Ensure the availability and performance of the customer-facing platform by managing highly available Windows and Linux systems. Automate workflows using infrastructure as code and maintain disaster recovery plans and security best practices.

Job Summary 

As a Site Reliability Engineer, you will play a critical role in ensuring the availability and performance of our customer-facing platform. You will work closely with DevOps, DBA, and Development teams to provision and maintain infrastructure, deploy and monitor our applications, and automate workflows. Your contributions will have a direct impact on customer satisfaction and overall user experience. 

 

Responsibilities and Deliverables 

  • Manage, monitor, and maintain highly available systems (Windows and Linux).
  • Analyze metrics and trends to ensure performance and rapid scalability. 
  • Address routine service requests while identifying ways to automate and simplify. 
  • Create infrastructure as code using Terraform, ARM templates, or CloudFormation.
  • Maintain data backups and disaster recovery plans. 
  • Adhere to security best practices through all stages of the software development lifecycle.
  • Follow and champion ITIL best practices and standards. 
     

 Organizational Alignment 

  • Reports to the Senior SRE Manager.
  • This role involves close collaboration with DevOps, DBA, and security teams. 
     

Technical Proficiencies 

  • Hands-on experience with AWS is a must-have. 
  • Proficiency in analyzing application, IIS, system, and security logs, as well as CloudTrail events.
  • Experience with CI/CD tools such as Jenkins and GitHub Actions.
  • Experience maintaining and administering Windows, Linux, and Kubernetes. 
  • Experience in automation using scripting languages such as PowerShell, Bash, or Python. 
  • Good understanding of networking concepts (VPC, subnet, private link, peering). 
  • Familiarity with configuration management using Ansible, Azure Automation, or similar.
  • Familiarity with observability tools such as New Relic, AppDynamics, or Datadog.

 
Experience 

  • 3+ years of experience in an SRE or System Administration role.
  • Demonstrated ability to build and support high-availability Windows/Linux servers.
  • 2+ years of experience working with cloud technologies, including AWS and Azure.
  • Comfortable using Scrum, Kanban, or Lean methodologies. 

 

Education 

  • Bachelor’s Degree or College Diploma in Computer Science, Information Systems, or equivalent experience. 
