OXIO Corporation

Site Reliability Engineer

Posted a month ago

United States

⭐ 2-5 years experience

AI Summary

Design and implement cloud platforms to support backend services while automating technical operations like deployment and scaling. Monitor mission-critical infrastructure to ensure maximum uptime and participate in on-call rotations and blameless postmortems.

Site Reliability Engineer

OXIO is the first NeoTelco. We are building the world’s largest, most accessible, and most insightful telecom network. Our platform empowers anyone to spin up their own carrier from a browser, and supports them as they scale their network to millions of users.

We ensure that users and devices are connected, and stay connected, wherever they go: across countries, carriers, and cellular technologies. We help them pay less for mobile data. This technology is provided through our Carrier-as-a-Service platform, BrandVNO, a fully customizable telecom service. In addition, we enable clients of our service to extract value from telecom data, enriching their customer experience, business intelligence, and product understanding in the many markets in which we operate.

Come join us in creating a modern technology platform with a group of engineers dedicated to advancing our vision. Our team is passionate about what we build, open to new ideas and challenges, and has its sights set on the future of connectivity.

Responsibilities

Design and implement a cloud platform to support OXIO backend services
Automate technical operations: deployments, scaling, recovery, etc.
Monitor and maintain mission-critical production infrastructure to ensure maximum uptime
Participate in an on-call rotation and a culture of continuous improvement through blameless postmortems
Enable the Engineering, Telecom, and Data Engineering teams by providing them with the tools to operate the services they build

Essentials

Understanding of Linux/Unix systems (most of our systems are Linux-based).
Familiarity with Linux/Unix system internals like process management, filesystems, memory management, and networking.
Proficiency in at least one programming language (Python, Go, or Ruby) and strong skills in scripting (Bash, Perl).
Experience with infrastructure provisioning tools such as Terraform, CloudFormation, or Ansible.
Familiarity with containerization (Docker) and orchestration tools (Kubernetes).
Familiarity with monitoring tools like Prometheus, Grafana, or Datadog.
Knowledge of setting up alerts, analyzing logs, and creating dashboards for observability.
Familiarity with incident management practices (e.g., runbooks, postmortems).
Experience participating in an on-call rotation and handling incidents.
Experience in setting up and maintaining Continuous Integration/Continuous Delivery pipelines (Jenkins, GitLab CI, CircleCI, etc.).
Hands-on experience with cloud providers (AWS, Google Cloud, Azure).
Knowledge of virtualization technologies (VMware, KVM) and cloud-native architecture.
Understanding of TCP/IP, DNS, HTTP/HTTPS, load balancing, and firewalls.

Nice to have

Strong understanding of deployment strategies (canary releases, blue-green deployments, etc.).
Familiarity with high availability and failover mechanisms.
Familiarity with IAM (Identity and Access Management) and zero trust principles.
Experience working with distributed systems (e.g., Kafka, Cassandra, Elasticsearch).
Experience building custom monitoring tools or writing complex automation scripts.
Functional knowledge of database management (SQL and NoSQL).
Familiarity with distributed tracing (Jaeger, OpenTelemetry) and advanced log aggregation strategies (ELK stack, Splunk).
Familiarity with performance profiling tools and optimizing application performance under heavy load.
Familiarity with load testing and identifying bottlenecks.
Familiarity with configuration management using SaltStack for maintaining server configurations.
