Responsible for advanced server implementation, troubleshooting, and escalation support for a diverse client base in a remote MSP environment. The role involves managing virtualization, cloud platforms, and identity services while utilizing AI tools to improve operational efficiency.
SUMMARY
The Senior Systems Engineer is a senior technical contributor responsible for
advanced server and systems implementation, troubleshooting, and escalation
support across a diverse client base within a remote MSP environment. This role
handles complex infrastructure incidents escalated from Tier 1, drives
proactive improvements, and serves as a subject matter expert for Windows and
Linux server administration, virtualization, identity, storage, backup, and
cloud platforms. The engineer works closely with clients and internal teams to
deliver high-quality, reliable managed services, and is expected to use AI
tools to work faster and raise the quality of their work.
JOB RESPONSIBILITIES
- Build, configure, and administer Windows Server (2016, 2019, 2022, 2025) and
Linux servers (RHEL, CentOS, Ubuntu) across multiple client environments.
- Administer and troubleshoot Microsoft Active Directory, Group Policy, DNS,
DHCP, ADFS, and Remote Desktop Services, including forest and domain
architecture, trusts, replication, and FSMO roles.
- Manage and optimize virtualization platforms including VMware vSphere (ESXi,
vCenter, vSAN), Microsoft Hyper-V (failover clustering, live migration), and
Nutanix HCI (AHV, Prism).
- Administer Microsoft 365 (Exchange Online, SharePoint, Teams, Entra ID) and
hybrid identity configurations.
- Support and troubleshoot cloud workloads in AWS and Azure, including IaaS,
PaaS, and hybrid architectures.
- Manage enterprise storage (SAN, NAS, iSCSI, Fibre Channel, NFS) and backup and
disaster recovery platforms such as Cove, Metallic, and Veeam; execute DR tests
and validate recovery.
- Lead advanced troubleshooting and root cause analysis for complex server,
virtualization, identity, and performance issues.
- Lead patching, firmware upgrades, and lifecycle management for servers and
infrastructure appliances across the client base.
- Develop and maintain monitoring strategies using LogicMonitor or equivalent
platforms; create custom alerts and dashboards and remediate issues before
client impact.
- Manage and resolve escalated tickets within MSP platforms (ConnectWise, Dell
KACE, Freshservice); document resolutions thoroughly.
- Conduct system assessments and assist with architecture diagrams, runbooks,
and technical documentation for client environments.
- Conduct periodic security posture reviews including vulnerability assessments,
configuration audits, and hardening recommendations aligned to NIST, CIS, or
client-specific frameworks.
- Use AI tools to accelerate troubleshooting, scripting, documentation, and
research, while reviewing AI output for accuracy before applying it in client
environments.
- Build automation and scripts (PowerShell, Python, Bash) to reduce manual
effort and improve consistency, including AI-assisted script development.
- Evaluate emerging platforms, tooling, and AI-driven capabilities on behalf of
the MSP and recommend adoption into the standard service stack.
- Serve as the primary escalation point for server and systems incidents and
own resolution from escalation through closure.
- Act as primary escalation for P1 and P2 outages, coordinating with
cross-functional teams and keeping management and stakeholders informed.
- Lead change management activities including planning, scheduling,
implementing, and validating production changes.
- Participate in on-call rotation and after-hours maintenance windows as
required.
- Collaborate with hardware vendors, software vendors, and cloud providers to
coordinate RMAs, support cases, and complex issue resolution.
- Assist in onboarding new MSP clients including discovery, documentation, and
transition planning.
- Identify recurring issues and drive proactive improvements to reduce incident
volume and improve client stability.
QUALIFICATIONS
- 5 to 7 or more years of hands-on systems or infrastructure engineering
experience, preferably in an MSP or multi-client managed services environment.
- At least 2 years in a senior or lead technical capacity.
- Deep expertise across Windows Server, Active Directory, and at least one major
virtualization platform (VMware, Hyper-V, or Nutanix).
- Strong scripting and automation skills (PowerShell, Python, Bash) and a track
record of using automation to improve operations.
- Demonstrated use of AI tools (such as Microsoft Copilot, ChatGPT, Claude, or
similar) for troubleshooting, scripting, or documentation, along with the
ability to use them responsibly, is strongly preferred.
- Familiarity with compliance frameworks (HIPAA, SOC 2, NIST, PCI-DSS) as they
apply to systems security controls.
- Exposure to ITSM and ITIL practices within a managed services delivery model.
- Degree in Information Technology, Computer Science, Systems Engineering, or
equivalent experience.
- Required or highly preferred certifications: Microsoft Windows Server Hybrid
Administrator Associate, Azure Administrator Associate, MCSE, or equivalent.
- Additional desirable certifications: VMware VCP-DCV, Nutanix NCP, AWS
Solutions Architect or SysOps Administrator, CompTIA Server+/Security+, or ITIL
Foundation.
- Excellent verbal and written communication skills required for client-facing
interactions and technical documentation.
- Ability to convey complex technical concepts to non-technical stakeholders
clearly and professionally.
- Strong time management and the ability to handle multiple client priorities at
once in a fast-paced remote MSP environment.
- Collaborative mindset with a commitment to mentoring Tier 1 engineers and
contributing to team knowledge sharing.
- Curiosity and willingness to adopt AI tools in daily work, with good judgment
about when to rely on them and when to verify.
- Self-directed and accountable, comfortable working independently in a fully
remote setting with minimal supervision.
JOB REQUIREMENTS
- Should be willing to accept a long-term work-from-home arrangement.
- Should be amenable to a permanent night shift schedule.