Site Reliability Engineer (SRE)

Posted: 3 days ago | Location: United States | Salary: undisclosed

Job Description

Position: Site Reliability Engineer (SRE)
Duration: Full Time
Location: New York (100% Remote). Candidates must work Eastern Time hours.

The Site Reliability Engineer is responsible for the availability and reliability of critical platform services and applications, ensuring they meet the requirements of internal and external users. This position offers the chance to positively impact patient outcomes in healthcare by ensuring the availability and reliability of services used by healthcare providers to distribute important medical data.

In addition, this position is a ground-floor opportunity to be instrumental in the transformation of an industry leader's offerings. You will be a key contributor in the transition from running services in a data center to providing services and capabilities that are scalable, always available, and cloud-native. If you enjoy solving hard problems and want to be a part of a legacy that impacts our world by improving patient outcomes, this job may be for you!

Main Responsibilities

- Creates solutions using cloud technologies to solve client technical and business challenges.
- Participates in system design consulting, platform management, and capacity planning.
- Implements new tools and techniques to increase scalability and performance.
- Architects and builds automation tools to increase reliability and speed.
- Implements CI/CD solutions to improve traceability and developer experience.
- Implements monitoring and logging systems.
- Gathers and analyzes metrics from both operating systems and applications to assist in performance tuning and fault finding.
- Measures and optimizes system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Partners with development teams to improve services through rigorous testing and release procedures.
- Creates sustainable systems and services through automation of tasks and uplifts.
- Builds software and systems to manage platform infrastructure and applications.
- Improves reliability, quality, and time-to-market of our suite of software solutions.
- Maintains or refactors existing processes to align with serverless architecture.
- Plans and conducts technical tasks associated with the implementation and maintenance of internal cloud enterprise-shared virtualization infrastructure.
- Deploys software to cloud computing infrastructure and works with system configuration, deployment automation technologies, and ETL tools and techniques.
- Performs the implementation, operational support, maintenance, and optimization of network hardware, software, and communication links of the cloud infrastructure.
- Resolves complex problems, creates and improves procedures, and facilitates communication.

Education and Experience

- Bachelor's degree preferred.
- A minimum of 5 years of experience as a Site Reliability Engineer.
- Strong attention to detail.
- Demonstrated oral and written communication skills.
- Ability to work independently and meet deadlines.
- Ability to work effectively in a cross-functional team, with demonstrated ability to partner with other departments.
- Excellent teamwork skills.
- Ability to react to change productively.
- Ability to program in one or more high-level languages (examples: Java, Python, JavaScript).
- Understanding of networking concepts in both self-managed and cloud environments.
- Experience designing and implementing distributed systems (examples: Kubernetes, Functions-as-a-Service).
- Experience implementing and using Infrastructure-as-Code and Configuration Management tooling (examples: Terraform, AWS CloudFormation, Puppet, Ansible, ARM, Google Cloud Deployment Manager).
- Experience with DevOps/GitOps tooling (examples: Jenkins, CircleCI, Bamboo, GitLab CI, GitHub Actions, GoCD).
- Experience with scripting systems (examples: Python, Perl, Bash, PowerShell).
- Experience with Application Performance Monitoring, alerting, notification, and reporting tools (examples: Nagios, Prometheus, ELK, New Relic, AppDynamics, OpsGenie).
- Experienced and comfortable coordinating live incident calls.
- Experienced and comfortable coordinating and documenting Root Cause Analysis investigations.

Regards,
Chris Fernandes
Technical Recruiter
Endure Technology Solutions
www.endure.tech