Lead the PMO and executive operations function to drive cross-functional execution of complex datacenter, manufacturing, and infrastructure programs. Manage the operating cadence for senior leadership, including QBRs and strategic planning, while overseeing a team of senior project managers.
54 Remote Job Openings at Nscale
Lead the design, standardization, and continuous improvement of operational processes across Nscale's Business Operations. Create scalable playbooks and KPI frameworks to align site operations, manufacturing, and cross-functional partner teams.
Lead the BI and financial analytics function to drive executive decision-making through dashboards, models, and reporting systems. Oversee FP&A operations and accounting workflows while partnering with cross-functional teams to translate operational data into financial insights.
Administer global equity programs including grants, vesting, and exercises while ensuring data integrity across HRIS and payroll systems. Provide reporting, analysis, and employee support to ensure a high-quality experience throughout the equity lifecycle.
Lead the utility strategy and technical execution for North American data centers, focusing on high-density AI and GPU workloads. Own the end-to-end process of securing and optimizing power supply, including grid engagement and on-site generation design.
Lead strategic sourcing for data center infrastructure equipment and materials, managing the end-to-end category strategy. This includes negotiating high-value framework agreements and executing competitive bidding processes to optimize costs and supply continuity.
Provide first- and second-line technical support for internal teams and external customers via ticketing, email, and chat. Diagnose and resolve hardware, software, and cloud infrastructure issues while maintaining a comprehensive knowledge base.
Ensure the efficiency, reliability, and scalability of data center infrastructure while managing critical incidents and customer support tickets. Collaborate with engineering teams to improve observability, automate diagnostics, and optimize GPU cloud performance.
Build and manage a Python-based Fleet Manager platform for the automated provisioning, testing, and remediation of GPU nodes and network switches. Design distributed workflow orchestration systems to coordinate complex hardware lifecycle operations and ensure infrastructure reliability.
Provide first- and second-line technical support for customers using the AI GPU cloud platform. Diagnose and resolve infrastructure and connectivity issues while collaborating with engineering teams to improve platform stability.
The role involves designing and building data workflows and architectures using Palantir Foundry to support various business functions. You will be responsible for the end-to-end delivery of use cases, data governance, and the integration of AI data fabrics.
Lead the security strategy and architecture for Nscale's SaaS and enterprise application ecosystem to reduce risk from shadow IT. Partner with Identity, Legal, and IT teams to implement scalable security controls, access management, and governance frameworks.
Act as the regional finance lead for US operations, bridging local operational realities and global finance processes. Own month-end close and statutory reporting, and ensure strong internal controls and compliance across regional entities.
Own and execute the USA go-to-market strategy to drive enterprise adoption and pipeline growth. Lead regional marketing efforts across field activation, ABM, partner programs, and event conversion.
Produce high-quality video content across social media, brand campaigns, and event coverage from rough cut to final delivery. Collaborate with creative teams to translate briefs into polished visual narratives and maintain a library of reusable assets.
Design and operate scalable bare metal provisioning platforms using OpenStack Ironic to support GPU cloud infrastructure. Manage the full hardware lifecycle and collaborate with cross-functional teams to ensure platform reliability and community alignment.
Principal Technical Program Manager (TPM) - AI Infrastructure Operations · Nscale · Full Time · 6 days ago
Drive complex, cross-functional programs to ensure the stability and growth of GPU fleets and InfiniBand network fabrics. Establish and track critical infrastructure KPIs such as availability and uptime while optimizing operational workflows.
Design and scale detection and response capabilities across infrastructure, cloud, and endpoint environments. Build SIEM pipelines and implement AI-driven automation to reduce response times and improve alert fidelity.
Lead treasury policies, governance, and liquidity risk oversight while ensuring covenant compliance across various loan agreements. Develop senior-level reporting for executive decision-making, lender engagement, and rating agency readiness.
Build and scale Nscale's threat intelligence capability by translating adversary insights into actionable security outcomes. Integrate intelligence into SIEM pipelines and automation workflows to improve detection and response across cloud and enterprise systems.
Define and lead the identity and access architecture, including authentication and authorization strategies across infrastructure and platform systems. Establish zero trust principles and scale workload identity patterns for highly distributed GPU-based environments.
Support the execution of project-based financing and debt capital markets transactions specifically for the Americas region. This includes building transaction models, managing data rooms, and coordinating with legal and accounting teams for documentation.
Senior Analyst, Treasury - Financing and Debt Capital Markets · Nscale · Full Time · 6 days ago
Support the execution of project-based financing and debt capital markets transactions, primarily for the Americas region. This includes building transaction models, managing data rooms, and coordinating with legal and accounting teams for covenant compliance.
Lead end-to-end project management for NetSuite and operational accounting system initiatives, ensuring delivery on time and within scope. Coordinate cross-functional integrations between accounting, IT, and procurement while maintaining SOX compliance and governance.
Manage the end-to-end purchase order lifecycle in NetSuite, including creation, maintenance, and validation of requests. Coordinate with procurement, finance, and accounts payable teams to ensure operational efficiency and policy compliance.
Lead transaction advisory and technical diligence for strategic investments, M&A, and large-scale AI infrastructure deployments. Act as a senior advisor to executive leadership to ensure technical feasibility and commercial soundness before capital commitment.
Design and operate the endpoint and device security foundation across employee, engineering, and privileged admin devices. Establish secure baseline standards and integrate device posture with identity and access management workflows.
Build and operate Nscale's privileged access operating model across enterprise systems, SaaS, and production environments. Focus on implementing JIT access, break-glass procedures, and automated revocation to eliminate standing privileges.
Staff Security Engineer - Security Data, Detection and Automation · Nscale · Full Time · 6 days ago
Build the telemetry, detection, and response automation foundation for Nscale's SOC capability. Focus on turning raw telemetry into high-signal security outcomes and measuring SOC performance through case-quality metrics.
Build and improve automation, tooling, and infrastructure to support AI workloads and platform services. Participate in incident response, troubleshooting, and the maintenance of SLOs/SLIs to ensure system stability.
Provide first- and second-line technical support for internal teams and external customers via ticketing, email, and chat. Diagnose and resolve hardware, software, and cloud infrastructure issues while maintaining a knowledge base and improving support workflows.
Lead strategic growth initiatives, capital formation, and transaction execution within the digital infrastructure sector. This includes managing M&A opportunities, structuring debt facilities, and developing sophisticated financial models to inform investment decisions.
Build and operate shared Kubernetes-based platform foundations to support AI applications and services at scale. Focus on improving reliability, scalability, and observability while reducing manual operational toil through automation.
Define the technical direction and architecture for Nscale's Slurm-based HPC platform domain. Lead cross-functional initiatives to integrate HPC systems with cloud-native APIs and infrastructure to ensure reliability and scalability.
Principal Site Reliability Engineer - AI Infrastructure Operations · Nscale · Full Time · 6 days ago
Lead the long-term reliability strategy and design large-scale control-plane systems for AI and HPC infrastructure. Act as a technical authority for automation and operational architecture while mentoring senior engineers to improve platform availability.
Lead offensive security operations and adversary simulations to identify vulnerabilities across cloud, infrastructure, and identity systems. Collaborate with detection and response teams through purple teaming to strengthen the organization's overall security posture.
Lead the technical direction and operational lifecycle of high-performance RDMA network fabrics for a global AI GPU cloud. Drive automation, reliability, and performance improvements while serving as the subject matter expert for interconnect networks.
Technical Program Manager - Enterprise Security, Metrics and Business Intelligence · Nscale · Full Time · 6 days ago
Lead the delivery of enterprise security programs by turning technical work into measurable outcomes and executive reporting. Establish operating mechanisms and BI dashboards to track security coverage, risk reduction, and audit readiness.
Own the analytics layer for GPU procurement and hardware supply chain by building datasets, models, and reporting products in Palantir Foundry. Act as the primary link between technical data engineering and business stakeholders to translate decisions into analytical requirements.
Lead and coordinate international deployment teams to deliver large-scale GPU infrastructure in next-gen AI datacenters. Oversee technical layout, cabling designs, and operational readiness while acting as the senior onsite authority.
Responsible for the physical execution, design, and layout of large-scale GPU infrastructure projects within datacenters. This includes managing BOMs, ensuring environmental limits are met, and overseeing structured cabling and initial network configurations.
Lead treasury policies, governance, and liquidity risk oversight while managing bank reporting and covenant compliance. Own short- and medium-term liquidity planning and treasury controls, and support capital allocation and financing strategies.
Execute corporate and project-level financing transactions from concept to closing, specifically for the Americas region. Manage transaction workstreams, build complex financial models, and coordinate with legal and accounting teams to optimize capital structure.
Design and build scalable data pipelines and models to create a digital twin of Nscale's operational signals. Develop trusted datasets and metrics to enable self-serve analytics for capacity planning and cost optimization.
Design and build scalable data pipelines and models to create a digital twin of Nscale's operational signals. Develop trusted datasets and metrics to enable self-serve analytics for capacity planning, cost optimization, and customer reporting.
Lead pricing execution and commercial economic integrity for US operations, ensuring deal feasibility and margin governance. Translate infrastructure capacity and business strategy into executable financial outcomes and commercial inventory availability.
Lead the development of operational analytics and reporting frameworks to provide visibility into AI infrastructure performance and reliability. Partner with cross-functional teams to define KPIs and transform data from Jira and ServiceNow into actionable insights for executive and customer reporting.
Lead the end-to-end operational management of data center portfolios to ensure reliability, safety, and compliance. Drive the strategic vision for infrastructure scaling and manage cross-functional teams to support a high-performance AI cloud platform.
Act as a high-impact operator solving the company's most critical and ambiguous strategic and operational challenges. Drive end-to-end delivery of projects across various functions including Commercial, Infrastructure, Product, and Finance.
Own the end-to-end delivery of multiple concurrent enterprise system rollouts from initial discovery through go-live and BAU handoff. This includes synthesizing stakeholder requirements into scalable solution designs and managing third-party implementation partners.
The role is responsible for maintaining the accuracy and integrity of the general ledger, managing month-end close procedures, and ensuring compliance with accounting standards. It involves performing detailed reconciliations and collaborating cross-functionally to support financial reporting and data integrity.
Director, Mechanical Engineering – North America, Data Centers · Nscale · Full Time · 6 days ago
Lead mechanical design ownership and engineering governance for North American data centers, focusing on high-density AI and liquid-cooled GPU infrastructure. Manage the full design lifecycle from concept to commissioning while leading a high-performing team of engineers.
Own and optimize the Incident and Change Management processes, including implementing workflows in Jira Service Management and chairing the Change Advisory Board. Act as the Incident Commander for major events and report program health metrics to the senior leadership team.
Lead the architecture and delivery of Nscale's vertically integrated managed AI services stack, focusing on APIs, services, and control planes. Partner with cross-functional teams to ensure scalability, security, and operational excellence across distributed systems.