Design and deploy scalable spine-leaf network architectures and high-performance Ethernet fabrics to support GPU clusters and AI workloads. Develop automation and Infrastructure-as-Code solutions while optimizing traffic flows for AI training and inference.
Lightning AI
5 Remote Job Openings at Lightning AI
Manage full-cycle recruiting for AI software engineering, infrastructure, and product teams. Focus on building a diverse candidate pipeline and improving scalable, equitable interview processes.
Design and operate large-scale GPU infrastructure platforms to minimize incidents and enable customer features. Collaborate across engineering teams to automate operational workflows and participate in an on-call rotation.
This is a general expression of interest in the company's talent community, not an application for a specific role. Candidates are invited to submit their information to be considered for future opportunities across the company's global hubs.
Own and evolve a scalable observability platform for metrics, logs, and traces across GPU-enabled bare-metal infrastructure. Design multi-tenant telemetry pipelines and noise-resistant alerting systems to support both internal operations and external customers.