Position Overview
Want to help make a better world? As a Senior Site Reliability Engineer at Autodesk, you can help us build and operate reliable, secure, and scalable cloud services for Autodesk GovCloud products.
As part of a new SRE team supporting Autodesk GovCloud, you will have a unique opportunity to help shape how Autodesk deploys, runs, and improves production services in restricted cloud environments. This is a foundational role where you will help establish the operating model, reliability practices, automation, and engineering standards needed to support critical customer-facing services.
You will combine software engineering and production operations to deploy, run, monitor, improve, and automate Autodesk services in GovCloud. You will partner closely with product engineering, security, compliance, platform, and infrastructure teams to ensure services are reliable, scalable, secure, and ready for production.
The ideal candidate has deep experience operating production systems at scale, an automation-first mindset, and the ability to improve reliability through engineering practices such as SLOs/SLIs, production readiness, incident management, observability, resilience testing, and toil reduction. Success in this role requires strong technical judgment, a customer-focused mindset, and a passion for using software engineering to solve operational problems at scale.
In accordance with GovCloud Cloud Service Provider Security Requirements, this role must be performed by U.S. Citizens. Employment is contingent upon meeting all applicable government security and eligibility requirements, including necessary background investigations and government-issued security clearances.
Responsibilities
Serve as a primary owner for the reliability, availability, performance, operability, and capacity of one or more production services
Deploy, operate, maintain, and continuously improve production services running in Autodesk GovCloud environments
Partner with engineering teams to ensure services are designed with reliability, scalability, security, and operability in mind
Define and operate reliability practices such as SLOs/SLIs, error budgets, production readiness reviews, service reviews, and operational health reviews
Build automation to improve deployment safety, operational efficiency, incident response, and service recovery
Design, develop, and maintain software, automation, and tooling that improve the reliability, scalability, and efficiency of production systems
Implement and improve monitoring, alerting, logging, tracing, and observability capabilities across supported services
Lead and participate in incident response, troubleshooting, and post-incident reviews focused on learning and continuous improvement
Develop and maintain operational documentation, runbooks, and recovery procedures
Scale and enhance resilience testing and Gameday practices to validate system behavior, recovery capabilities, and operational readiness
Continuously identify and eliminate operational toil through software engineering, automation, and process improvement
Ensure supported services remain compliant with Autodesk security, privacy, and regulatory requirements, including FedRAMP and related controls where applicable
Participate in a 24x7 on-call rotation for production services
Function effectively in a fast-paced environment while helping establish and mature operational excellence practices for Autodesk GovCloud
Minimum Qualifications
B.S. or higher in Computer Science, Engineering, or a related technical discipline, or equivalent practical experience
7+ years of experience in Site Reliability Engineering, Software Engineering, Platform Engineering, Cloud Infrastructure, or Production Operations
Experience operating and supporting customer-facing production services in large-scale cloud environments
Strong understanding of reliability engineering principles, including SLOs/SLIs, observability, incident management, capacity planning, production readiness, and automation
Experience with AWS, Azure, or other public cloud platforms
Experience developing automation using languages such as Python, Go, Java, PowerShell, Bash, or similar
Experience with Infrastructure as Code, CI/CD pipelines, deployment automation, and modern cloud operations practices
Understanding of security, compliance, and operational risk management in production environments
Strong written and verbal communication skills
Preferred Qualifications
10+ years of experience operating highly available, customer-facing production systems
Experience with AWS GovCloud, FedRAMP, IL4/IL5, or other regulated cloud environments
Experience supporting services with stringent availability, reliability, and security requirements
Experience with containers, Kubernetes, cloud-native architectures, APIs, load balancing, networking, DNS, and distributed systems
Experience with observability platforms such as Splunk, Dynatrace, Datadog, CloudWatch, or similar technologies
Experience operating databases, storage platforms, messaging systems, and caching technologies
Experience designing and implementing operational automation at scale
Experience leading or participating in Gamedays, disaster recovery exercises, resilience testing, or operational readiness reviews
Strong incident management experience, including technical leadership during major incidents and stakeholder communication
Strong collaboration skills and ability to work effectively across engineering, security, compliance, and operations teams
Passion for building reliable, secure, and scalable systems that customers can trust
About Autodesk
Welcome to Autodesk! Amazing things are created every day with our software – from the greenest buildings and cleanest cars to the smartest factories and biggest hit movies. We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.
We take great pride in our culture here at Autodesk – it’s at the core of everything we do. Our culture guides the way we work and treat each other, informs how we connect with customers and partners, and defines how we show up in the world.
When you’re an Autodesker, you can do meaningful work that helps build a better world designed and made for all. Ready to shape the world and your future? Join us!
Benefits
From health and financial benefits to time away and everyday wellness, we give Autodeskers the best, so they can do their best work. Learn more about our benefits in the U.S. by visiting https://benefits.autodesk.com/
Equal Employment Opportunity
At Autodesk, we're building a diverse workplace and an inclusive culture to give more people the chance to imagine, design, and make a better world. Autodesk is proud to be an equal opportunity employer and considers all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender, gender identity, national origin, disability, veteran status or any other legally protected characteristic. We also consider for employment all qualified applicants regardless of criminal histories, consistent with applicable law.
Belonging
We take pride in cultivating a culture of belonging where everyone can thrive. Learn more here: https://www.autodesk.com/company/global-belonging
Are you an existing contractor or consultant with Autodesk?
Please search for open jobs and apply internally (not on this external site).