Bridge 351

T3 Operations & Support Specialist – Compute & OS (Full Remote - Europe)

Posted 2 hours ago

Portugal

⭐ 5-10 years experience

AI Summary

Hold Tier-3 operational ownership for Compute and Operating System services to ensure platform stability and security compliance. Drive automation to reduce operational toil and manage complex incidents for the EDP Platform.

🏛️ About the Role

We are looking for a senior T3 Operations & Support Specialist (Compute & OS) to join the Local Operations Germany team. In this role, you will hold Tier-3 operational ownership for Compute and Operating System services supporting the EDP Platform, a cloud-native, hybrid platform built to accelerate software product delivery across the energy sector. You will handle complex incidents, ensure platform stability and security compliance, and drive automation to reduce operational toil, directly supporting production business applications for 50Hertz.

🛠️ Key Responsibilities

Complex incident management & troubleshooting

Handle complex incidents, perform deep troubleshooting and root cause analysis, and drive permanent fixes and preventive measures.
Ensure compute/OS readiness for releases and changes, including monitoring/alerting coverage, performance baselines, hardening, patch strategy, rollback and recovery procedures, and runbooks.
Execute and continuously improve standard operational procedures through automation to reduce toil and improve MTTR and stability.
Coordinate with Kubernetes/Data and Network/Storage SMEs to resolve cross-domain production issues.

Operational readiness for deployments

Validate deployment artifacts from an operations perspective.
Define and enforce quality assurance measures (documented standard operating procedures, successful test reports) to ensure the high quality of delivered products and services.
Ensure rollback strategies and operational monitoring (observability) are in place for every production deployment.

Platform stability & Kubernetes operations

Monitor system health, performance metrics, and service availability across multi-tenant environments.
Identify, analyze and resolve incidents, minimizing service disruption.
Trigger root cause analysis and implement corrective and preventive actions.

Automation & service reliability

Address recurring operational issues by automating the standard operating procedures used to remediate them.
Validate all automated procedures following the established software development lifecycle including staging, testing, and validation reviews.

Security & compliance

Implement monitoring and logging strategies to support audit and compliance requirements.
Perform routine security scans and remediate identified vulnerabilities.

✅ Mandatory Experience

5–10+ years in IT operations / service delivery / platform operations with demonstrated leadership in mission-critical environments.
Proven experience implementing and leading Incident, Problem, Change, and Release governance in production.
Virtualization with VMware 8.
Operating Systems: Red Hat Enterprise Linux and Ubuntu.
Operating Systems Tools: Satellite, IPA, Certificate Server.
ITSM / Collaboration tools: Jira Service Management (JSM), Jira, Confluence.
Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) and SRE concepts.
Experience gathering operational insights from monitoring and observability tooling, including SLI/SLA/SLO management and tracking.
Hands-on experience documenting procedures and enforcing clear runbooks and playbooks.
Hands-on experience with monitoring and logging tools: Prometheus, Grafana, Datadog, Mimir, Loki.
Understanding of modern platform operations (Kubernetes/containers, automation, observability) sufficient to govern specialists.
GitOps and IaC awareness: Terraform/OpenTofu, ArgoCD, Helm.

🌐 Languages

English — minimum C1 CEFR level, spoken and written. Mandatory.
German — minimum C1 CEFR level, spoken and written. Mandatory.

⭐ Preferred Experience

Experience operating in regulated or high-availability industries (banking, telco, public sector, healthcare).
Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management.
Familiarity with enterprise DevOps toolchains: GitLab, JFrog Artifactory, Backstage, Harness.

📍 Location & Work Model

Remote from Europe - Occasional travel required.
Full-Time
