Reliability Engineer (React, Node, Salesforce, Integrations, SAP, AWS, CI/CD, Jenkins) - Remote

Octopus Computer Associates

Posted on Mar 31, 2021 by Octopus Computer Associates

Not Specified, Belgium
IT
Immediate Start
Annual Salary
Contract/Project


Reliability Engineer (React, Node, Salesforce, Integrations, SAP, AWS, CI/CD, Jenkins) - Remote - 9 months+

(Problem Management, Capacity Management)

One of our Blue Chip clients is urgently looking for a Reliability Engineer (React, Node, Salesforce, Integrations, SAP, AWS, CI/CD, Jenkins).

For this role you can work remotely.

Please find some details below:

Description: The client is looking for a senior profile who has good solution architecture skills and is familiar with React, Node, Salesforce, integrations, SAP and AWS (Lambda). He/she is not expected to code, but should be able to challenge the team on all aspects (monitoring, transactions, ...) to ensure a high-quality, reliable application that runs stably. Knowledge of CI/CD and Jenkins is also expected.

He/she also needs to be communicative as well as technically sound enough to challenge the team and drive improvements.

The Role: The position is responsible and accountable for the end-to-end reliability and availability of the Car T Business Critical platform. This requires a full-stack approach, from the front-end applications and back-end systems through integrations/APIs to the hosting components. A proactive monitoring system/approach will therefore be crucial for working with the service owners of the underlying services that make up the Car T Business Critical platform.

The ARE is responsible for problem, availability and capacity management and for the selection of components for continuity and disaster recovery. The ARE ensures that the same issues do not recur and that business disruptions decrease over time.

Key Responsibilities

- Problem Management:

o Manage all problems from the time they are detected through their resolution and closure, in order to eliminate recurring incidents and to minimize the impact of incidents that cannot be prevented.

o Responsible for full Problem Management Lifecycle (Record, Classify, Prioritize/Investigate and Diagnose/Resolve Problem/Close Problem).

o Manage specialist team resources towards problem root cause and resolution.

- Availability Management:

o Ensure that the level of service availability delivered in all services matches or exceeds the current and future agreed needs of the business, in a cost-effective manner.

o Accountable for driving down TTR (Time to Resolve) on major incidents affecting availability.

o Responsible for the end-to-end Lifecycle for Availability Management (Plan & Design for Availability, Risk Assessment, Implement Countermeasures, Test Availability & Resilience Mechanisms, Monitor, Measure, Analyze & Report Availability).

o Designs, develops and owns monitoring solutions in alignment with the reliability engineers of the different platforms making up the Car T Business Critical platform.

o Collaborates with solution owners, other service owners and release management for all platforms making up the Car T Business Critical platform.

o Works with delivery/build team to ensure application is maintainable and new development does not negatively impact production.

- Capacity Management:

o Ensure that cost-justifiable IT capacity always exists and is matched to the current and future agreed needs of the business.

o Responsible for Capacity Lifecycle management in conjunction with Infrastructure to analyze, tune and implement capacity as needed. Perform application sizing, planning and optimization.

- Continuous Service Improvement:

o Monitor and measure the quality of IT operations, benchmark these metrics, and initiate improvement actions when they do not meet minimum requirements.

o Responsible for documentation and incorporation of non-functional requirements in application development.

o Drive automation to reduce tickets and eliminate human error.

o Assist Solution owner with estimates and quotes for the application maintenance service.

o Works with Rapid Response teams to minimize TTR on major incidents and to identify chronic/recurring issues that require problem management.

- Continuity/Disaster Recovery

o Coordinate with the Infrastructure team and the Disaster Recovery team to provide input into ITSCM plans and testing activities.

Qualifications

o Relevant experience in the healthcare industry preferred

o Strong verbal and written communication skills

o Strong leadership skills, facilitating escalated issues to resolution

o Must be able to solve complex business problems and present recommendations to senior management effectively

o Extensive knowledge in monitoring, automation, orchestration and other management tools.

o Experience in one or more of the following technology solutions is required: Salesforce, cloud-native development, SAP ERP, integration/API design

o Proficiency in high-level languages and frameworks such as React, Node.js, Python

o Understanding of network layers, transaction routing and monitoring thereof

o Experience with CI/CD, configuring deployment pipelines, release management

o Skilled in performance testing and analysis of results

Please send your CV for full details and immediate interviews. We are a preferred supplier to the client.




Reference: 1146634623