Lead IT Incident Management Analyst

 Posted a day ago
     
5-10 years experience
AI Summary

Lead and optimize the Incident Management practice to ensure service-impacting events are managed, documented, and communicated effectively. Coordinate major incident response activities, facilitate post-incident reviews, and manage SLA governance to drive continuous service improvement.

JOB SUMMARY


The Lead Incident Management Analyst supports Empyrean’s IT Service Management program by leading and optimizing the Incident Management practice across IT Operations. This role provides operational leadership, process stewardship, and ServiceNow ITSM expertise to ensure service-impacting events are managed consistently, documented accurately, communicated effectively, and used to drive continuous improvement.

 

This position leads Incident Management and Major Incident coordination, including triage support, escalation coordination, service restoration tracking, business-impact communication, post-incident documentation, SLA governance, and mitigation-action follow-through. The role partners with internal teams, service owners, leadership stakeholders, and external partners to improve incident response, strengthen service reliability, and support high-quality IT service delivery. This role leads through process authority, coordination, and influence; formal technical resolution ownership remains with the appropriate service owners, platform owners, or technical teams.

 

This role may require occasional after-hours participation in urgent or high-impact incident response activities based on business impact, service criticality, or escalation requirements.

 

ESSENTIAL DUTIES AND RESPONSIBILITIES


ITSM / Incident Management

 

  • Lead Incident Management and Major Incident response activities, including triage, escalation, bridge facilitation, service restoration tracking, documentation, and corrective action follow-up.
  • Coordinate response across technical, operational, business, client-facing, and external partner teams during service-impacting events, including urgent or high-impact incidents outside standard business hours as needed.
  • Apply incident priority, impact, urgency, and Major Incident criteria to support timely escalation, SLA application, leadership engagement, and communication workflows.
  • Assess and document service impact, recovery actions, decisions, timelines, postmortems, and remediation activities for major or service-impacting incidents.
  • Maintain clear incident communications by translating technical details into business-impact updates, executive summaries, operational communications, and client-facing talking points as appropriate.
  • Facilitate post-incident reviews and track lessons learned, recurring issues, corrective actions, risks, owners, and due dates through completion.
  • Manage Incident Management SLA/OLA governance, including definitions, reporting, breach review, response and restoration performance, escalation adherence, and continuous improvement.
  • Analyze incident trends, SLA performance, aging work, recurring issues, and service risks to identify opportunities for operational improvement and service stability.
  • Escalate non-response, ownership gaps, SLA risks, and service-impacting blockers through established ITSM and leadership channels.
  • Support related ITSM processes, including Problem Management follow-up, Change Management coordination for service-impacting work, and request SLA visibility where fulfillment performance affects service delivery or operational risk.
  • Act as a ServiceNow process partner for Incident Management, SLA Management, reporting, dashboards, and related ITSM workflows.

 

Monitoring and Observability

 

  • Partner with observability and technical teams to identify monitoring, alerting, service health, and metric visibility improvements based on incident findings, SLA trends, service risks, and operational gaps.

 

General / Cross-Functional Operations

 

  • Support operational reporting, dashboard refinement, data validation, SOPs, communication standards, and knowledge artifacts related to Incident Management, SLA Management, service recovery, and post-incident improvement.
  • Facilitate incident, problem, SLA, and process improvement discussions; mentor teams in process adherence, documentation quality, SLA awareness, and ITSM maturity.
  • Contribute operational insight to improve ITSM services, incident practices, reporting needs, and ServiceNow roadmap priorities.

 

REQUIRED SKILLS AND ABILITIES


ITSM / Incident Management

 

  • Practical technical understanding of application, infrastructure, network, cloud, and end-user support environments, with the ability to guide technical discussions, identify dependencies, challenge unclear updates, and drive incident calls toward timely service restoration.
  • Strong operational leadership in Incident Management, Major Incident coordination, and ITSM process execution.
  • Strong understanding of ITIL-aligned Incident, Problem, Change, Knowledge, Request, and SLA Management practices.
  • Ability to coordinate service-impacting incidents across technical, operational, business, client-facing, and external partner teams.
  • Ability to lead major incident triage, clarify ownership, drive accountability, and translate technical activity into clear timelines, communications, executive summaries, and post-incident reviews.
  • Strong sense of urgency with the ability to remain organized, calm, and action-oriented during high-pressure or time-sensitive service-impacting events.
  • Sound judgment in applying process authority, escalating service risk, engaging leadership, initiating communication workflows, and distinguishing Incident Management coordination from technical resolution ownership.

Monitoring and Observability

 

  • Familiarity with observability, monitoring, alerting, service health, and operational visibility concepts.

 

General / Cross-Functional Operations

 

  • Strong working proficiency with ServiceNow or a similar ITSM platform, including incident workflows, SLA tracking, reporting, dashboards, and process improvement.
  • Effective communication skills with the ability to translate technical details into clear operational, business-impact, leadership, and client-facing updates.
  • Strong analytical and documentation skills, including incident trend analysis, SLA performance review, postmortems, SOPs, status updates, operational reports, and audit-ready records.
  • Ability to lead through influence, clarify ownership, drive accountability, and escalate risks, blockers, non-response, or service-impacting issues in a matrixed environment.
  • Strong organizational skills with the ability to manage multiple priorities, incidents, reviews, and follow-up items under time-sensitive conditions.
  • Collaborative, customer-focused, and service-reliability mindset with the ability to recommend practical improvements to incident workflows, escalation paths, SLA models, reporting logic, documentation standards, and operational communications.

 

KNOWLEDGE, EXPERIENCE, AND/OR EDUCATION REQUIREMENTS


Required:

 

  • Bachelor’s degree in Information Technology or a related field, or an equivalent combination of education and experience.
  • Minimum 5 years of experience in IT Service Management, IT Operations, Incident Management, Service Delivery, or a related corporate technology environment.
  • Demonstrated experience coordinating Incident Management or Major Incident Management processes in a multi-team, hybrid, or enterprise environment.
  • Experience supporting urgent, high-impact, or business-critical incidents, including after-hours escalation or coordination when required by business impact.
  • Experience using or supporting an ITSM platform; ServiceNow experience strongly preferred.
  • Experience with SLA reporting, breach analysis, aging incident review, service performance tracking, or operational dashboard review.
  • Experience preparing incident communications, business-impact updates, postmortem summaries, executive summaries, or operational status reports.
  • Experience collaborating with technical teams, business stakeholders, service owners, vendors, and leadership stakeholders.
  • Practical understanding of Incident, Problem, Change, Request, Knowledge, and SLA Management processes.
  • Ability to support high-touch incidents or service-impacting events involving clients, business stakeholders, or executive visibility.

 

Preferred:

 

  • ITIL Foundation certification or equivalent knowledge of ITIL principles.
  • Experience supporting Major Incident Management, post-incident reviews, Problem Management, root cause follow-up, or corrective action tracking.
  • Experience supporting SLA/OLA governance, reporting, breach review, or process improvement.
  • Experience supporting operational communications for leadership, business stakeholders, or client-facing teams.
  • Experience improving incident workflows, escalation paths, documentation standards, or service reporting.
  • Experience supporting incident response, service restoration, or operational troubleshooting in AWS-hosted or hybrid cloud environments.
  • Experience working in a complex, client-facing, or high-availability technology environment.

 

 

Disclaimer: This job description is not intended to be an exhaustive list of all duties, responsibilities, or qualifications associated with the job. Management reserves the right to modify or reassign job duties as business needs evolve.

 


Empyrean is an Equal Opportunity Employer, including disability and protected veteran status.
