capital.com

Senior SRE Engineer (Observability Focus)

Posted 25 days ago

Bulgaria, Cyprus, Poland

⭐ 5-10 years experience

AI Summary

Design and operate the end-to-end telemetry stack including metrics, logs, and traces across hybrid AWS and on-premise environments. Build actionable Grafana dashboards and mentor engineering teams on observability practices and structured logging standards.

We are a leading trading platform that is ambitiously expanding to the four corners of the globe. Our top-rated products have won prestigious industry awards for their cutting-edge technology and seamless client experience. We deliver only the best, so we are always in search of the best people to join our ever-growing talented team.

We're building out our observability practice and need a senior engineer who can own it end to end. This is a hands-on role. You'll design and operate the telemetry stack that gives our engineering teams real visibility into production — across a hybrid AWS and on-premise environment, at scale.

Responsibilities:

Own the full observability stack: metrics (VictoriaMetrics), logs (OpenSearch), and traces (OpenTelemetry) — from pipeline design to day-2 operations.
Architect and run VictoriaMetrics cluster topology (vmstorage/vminsert/vmselect), including vmagent scraping, remote write configuration, vmalert rules, and cardinality control.
Operate OpenSearch clusters: Index State Management (ISM), hot-warm-cold architecture, shard tuning, and ingest pipelines via Data Prepper.
Build and maintain OTEL Collector pipelines — receivers, processors, exporters — and instrument services across Java, Python, and JS/TS stacks (auto and manual).
Run Kafka as the telemetry transport layer (OTEL Collector → Kafka → backends), including topic design, partition strategy, consumer group lag monitoring, and throughput tuning for high-volume telemetry.
Manage log shipping infrastructure using Fluent Bit, Vector, or Fluentd; define structured logging standards and field normalization across services.
Build Grafana dashboards and alerting that engineers actually use — clear, actionable, with well-structured variables and thresholds.
Work with platform and application teams to improve sampling strategies (head/tail), batching, and context propagation across distributed services.
Contribute to incident response, post-mortems, and reliability improvements driven by observability signals.
Mentor engineers on observability practices, tooling, and structured logging standards.

Requirements:

6+ years in a DevOps, SRE, or platform engineering role, with at least 2 years focused on observability tooling at production scale.
Deep hands-on experience with VictoriaMetrics (or Prometheus) — MetricsQL/PromQL, exporters, service discovery, remote write, downsampling, and retention management.
Solid OpenSearch or Elasticsearch skills: cluster operations, Query DSL, ISM policies, and ingest pipeline design.
Production experience with OpenTelemetry: Collector configuration, OTLP, context propagation, and instrumentation across multiple languages.
Strong Kafka skills: producer/consumer patterns, consumer group management, Kafka Connect, Schema Registry, and JMX-based monitoring. Strimzi experience is a plus if you've run Kafka on Kubernetes.
Proficiency with log shippers (Fluent Bit, Vector, Fluentd) and structured log parsing/normalization.
Working knowledge of Kubernetes (operators, Helm), Argo CD/GitOps, and Terraform/Ansible.
Comfortable in a hybrid AWS + on-prem environment; solid understanding of networking as it applies to scraping and shipping pipelines.
Scripting ability in Bash or Python for automation and tooling.
Strong communication skills — you can explain observability tradeoffs clearly to engineers and non-engineers alike.
English proficiency.

What you will get in return:

• Competitive Salary: We believe great work deserves great pay! Your skills and talents will be rewarded with a salary that makes you feel valued and motivated.

• Work-Life Harmony: Join a company that genuinely cares about you, because your life outside of work matters just as much as your time on the clock.

• Generous Time Off: Need a breather? Our annual leave policy lets you recharge and enjoy life outside of work without a worry.

• Employee Referral Program: Love working here? Share the love! Bring your talented friends on board and get rewarded for growing our awesome team.

• Comprehensive Health & Pension Benefits: From medical insurance to pension plans, we’ve got your back. Plus, location-specific benefits and perks!

• Workation Wonderland: Live your digital nomad dreams with 30 extra days to work remotely from anywhere in the world (some restrictions apply). Adventure awaits!

• Volunteer Days: Make a difference! Take two additional paid days each year to support causes you care about and give back to the community.

Be a key player at the forefront of the digital assets movement, propelling your career to new heights! Join a dynamic and rapidly expanding company that values and rewards talent, initiative, and creativity. Work alongside one of the most brilliant teams in the industry.

Our company has an Internal Reporting Procedure, which is available from the Human Resources Department upon request at hr@capital.com. You may report a violation referred to in the Procedure under the terms specified therein.
