About the Role
As a Monitoring and Observability Architect, you will lead the design, implementation, and optimization of our observability infrastructure. This role is critical to ensuring the performance, reliability, and cost-effectiveness of our software applications and systems. You'll work cross-functionally to integrate tools like Grafana, VictoriaMetrics, Sumo Logic, and CloudWatch while balancing technical excellence with budget-conscious solutions.
What You'll Do
- Develop and execute scalable, cost-effective observability strategies for cloud and on-prem systems using tools such as Grafana, VictoriaMetrics, and Sumo Logic
- Design and maintain secure, high-performance monitoring architectures and dashboards that provide actionable system insights
- Define observability roadmaps that align with business goals, customer needs, and budget constraints
- Integrate and manage monitoring platforms within existing infrastructure, ensuring efficient data retention, optimized queries, and minimal resource overhead
- Implement automation for alerting and remediation to reduce manual intervention and operational costs
- Partner with infrastructure, development, and operations teams to improve system reliability and align observability objectives across the organization
- Evaluate and introduce new monitoring tools and technologies, staying current with industry best practices
- Provide architectural guidance and establish standards for observability implementations with a strong focus on cost optimization and compliance
What You'll Need
- 7+ years of experience in observability engineering or a related role
- Deep expertise in Grafana, VictoriaMetrics, and Sumo Logic
- Familiarity with observability tools such as Splunk, ServiceNow, Dynatrace, ScienceLogic, AWS CloudWatch, and Azure Monitor
- Experience building CI/CD-integrated observability pipelines
- Proficiency with Infrastructure-as-Code (IaC) using Terraform to manage alerts, dashboards, and integrations
- Strong understanding of observability frameworks, especially OpenTelemetry (OTel), and the ability to align solutions with developer experience, compliance, and cost control
- Hands-on experience with observability in Kubernetes environments (Helm, operators, native tooling) and monitoring EC2/VMs, on-prem systems, and AWS services
- Proven ability to build and implement application metric streaming frameworks
- Excellent problem-solving, communication, and collaboration skills
Applicants must be authorized to work for any employer in the U.S. DriveWealth is unable to sponsor or take over sponsorship of an employment visa at this time.