Principal Site Reliability Engineer

 Published 24 days ago
    
 United States
    
 $217,000 - $325,000 per year

Get to know Okta and Workforce Identity Cloud

 

At Okta, our motto is "Always On," and nowhere do we embrace that more than in Technical Operations. We strive to build the most reliable and performant systems on the planet through the skillful use of automation. 

Okta Workforce Identity Cloud (WIC) provides easy, secure access for your workforce so you can focus on other strategic priorities, such as reducing costs and doing more for your customers.

Get to know the role

What you’ll be doing

  • Becoming deeply familiar with every corner of a critical SaaS platform used by millions of customers daily.
  • Working to navigate a significant replatforming initiative, seamlessly moving dozens of critical components between container orchestration systems with zero downtime or customer impact.
  • Engaging with stakeholders across the group to understand component boundaries and dependencies, and acting as a guide and coach for your teammates.
  • Driving what’s next for our SDLC: how do we ideate, onboard, operate, and scale microservices and features in a secure, performant, always-on manner?
  • Identifying, understanding, and automating away manual processes through clever code and smart architecture.
  • Supporting a 24x7 online environment as part of a global on-call rotation.
  • Advocating best practices for scalable, reliable, and resilient systems and services across all of WIC engineering.

What you’ll bring to the role

  • 9+ years of experience as a site reliability or platform engineer, preferably in a fast-scaling environment.
  • 3+ years of experience building and operating workloads orchestrated by Kubernetes.
  • Familiarity with large-scale containerised deployments, both microservice and monolithic.
  • A willingness to go the extra mile: see a problem, fix the problem.
  • A passion for encouraging the development of engineering peers and leading by example.
  • Experience automating, securing, and scaling large-scale services in AWS and/or other cloud providers, under multiple orchestration layers such as Kubernetes, Rancher, or ECS.
  • Knowledge of CI/CD principles, Linux fundamentals, OS hardening, networking concepts, and Internet protocols.
  • Strong skills in multiple operational tooling languages such as Python, Rust, or Go.
  • A deep understanding of both relational and non-relational datastores, including replication and clustering strategies.

Experience in the following:

  • 8+ years of experience architecting and running complex networking infrastructure in AWS or other clouds.
  • 8+ years of experience leveraging tools such as Ansible, Chef, or Terraform to automate and manage expansive platforms.
  • 3+ years of experience operating Kubernetes-orchestrated workloads at scale.
  • Strong Linux understanding and experience.
  • BS in computer science (or equivalent experience).
  • An active U.S. Government Security Clearance is preferred for this position. Candidates without an active clearance may still be considered depending on qualifications and eligibility to obtain one.

*This position requires the ability to access Impact Level 4 (IL4) data, as defined by the Department of Defense (DoD) Cloud Computing Security Requirements Guide. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; see 22 CFR 120.15) upon hire.

This role requires in-person onboarding and travel to our San Francisco, CA HQ office during the first week of employment.

 

 

Okta is an Equal Opportunity Employer.

Okta is rethinking the traditional work environment, providing our employees with the flexibility to be their most creative and successful versions of themselves, no matter where they are located. We enable a flexible approach to work, meaning that for roles where it makes sense, you can work from the office or from home, regardless of where you live. Okta invests in the best technologies and provides flexible benefits and collaborative work environments and experiences, empowering employees to work productively in a setting that best suits their needs. Find your place at Okta: https://www.okta.com/company/careers/.

By submitting an application, you agree to the retention of your personal data for consideration for a future position at Okta. More details about Okta’s privacy practices can be found at https://www.okta.com/privacy-policy.
