pod network

Site Reliability Engineer (APAC)

Posted a month ago

United Kingdom

⭐ 2-5 years experience

AI Summary

The SRE will operate, improve, and scale the reliability of the pod platform, focusing on incident resolution and platform health. They will design observability tooling and automate operational workflows to eliminate recurring pain points.

🚀 About Pod Network

Pod is building a next-generation decentralized exchange focused on fairness, performance, and user experience. We believe traders shouldn't have to choose between speed, simplicity, and fair treatment, so we're building an exchange that delivers all three while enabling entirely new kinds of financial markets.

Under the hood, pod is powered by low-latency systems designed for fast settlement and strong guarantees around ordering, timing, and execution. These are challenging engineering problems, and the reliability of the platform depends on operating those systems safely and effectively at scale.

🛠️ The Role

We're looking for our first Site Reliability Engineer to help operate, improve, and scale the reliability of the pod platform.

You'll join a team of engineers who already share responsibility for production systems and participate in an established on-call rotation. From day one, you'll work closely with the broader engineering team while taking ownership of the tooling, processes, and operational practices that keep the platform running smoothly.

This is a hands-on role for someone who enjoys operating complex systems, investigating difficult production issues, and building the automation and infrastructure that turn reliability into a competitive advantage.

📟 On-call

You'll be responsible for platform health during Asian business hours as part of our existing engineering on-call rotation. There are no permanent overnight shifts, and you'll never be the sole person responsible for the platform—the rest of the rotation is covered by the wider team. Occasionally, you may flex outside your normal hours to help cover the schedule, but that's the exception rather than the rule.

🎯 What You'll Do

🔍 Respond to and resolve incidents

* Monitor the health and performance of the platform

* Respond to production incidents and drive them through to resolution

* Investigate failures, identify root causes, and coordinate fixes

* Ensure issues are detected, understood, and addressed quickly

⚡ Improve platform reliability

* Identify recurring operational pain points and eliminate them

* Improve software, deployment processes, and operational workflows

* Participate in incident reviews and help drive preventative improvements

* Contribute reliability-focused changes directly to production systems

📈 Build observability and operational tooling

* Design and maintain dashboards, metrics, alerting, and monitoring systems

* Improve signal quality while reducing alert fatigue

* Build automation and internal tools that make the platform easier to operate

* Help establish reliability best practices across the engineering organization

✅ Requirements

* Strong experience with Linux and cloud infrastructure

* Experience operating and supporting production systems

* Experience with Docker and containerized environments

* Experience with observability and incident-management tools such as Grafana, Prometheus, PagerDuty, or similar

* Ability to automate workflows using Rust, Python, Bash, or similar languages

* Strong troubleshooting and debugging skills

* A high degree of ownership and the ability to make sound decisions independently

➕ Nice to Have

* Experience with distributed systems

* Experience operating high-availability, low-latency services

* Experience with CI/CD systems and deployment automation

* Experience designing secure operational workflows and access controls

No prior blockchain or cryptocurrency experience is required.
