This Job Vacancy has Expired!

Site Reliability Engineer

Project Recruit

Posted on Sep 23, 2022 by Project Recruit

London, United Kingdom
IT
Immediate Start
Annual Salary
Contract/Project

Our client, a leading global supplier of IT services, requires a Site Reliability Engineer - Virtualisation SME based at their client's offices in London.

You may be able to work some days remotely.

This is a 1-year temporary contract to start ASAP.

Day rate: Competitive market rate

We are looking for a Site Reliability Engineer - Virtualisation SME with 10+ years of experience and excellent knowledge of VMware ESX and/or Nutanix HCI, and of container orchestration platforms such as Docker and Kubernetes.

Key Responsibilities

  • Responsible for the reliability and efficiency of virtualisation infrastructure through the delivery of common, repeatable tools and processes that greatly reduce the amount of toil the OS and DB Platform Operations team must perform.
  • Responsible for writing software to make the virtualisation infrastructure self-managing and self-service.
  • Responsible for automation and continuous service improvement by developing Infrastructure as Code.
  • Responsible for eliminating manual, repetitive, automatable, tactical tasks that are devoid of value.
  • Responsible for availability, latency, performance, efficiency, change management, monitoring and capacity planning.
  • Responsible for improving system performance, making effective use of resources, distributing load and reducing latency.
  • Responsible for identifying SLOs (Service Level Objectives) that align the team to meet availability and latency objectives.
  • Responsible for developing proactive monitoring solutions that alert on symptoms and not just on outages.
  • Responsible for performing detailed root cause analyses (RCAs) on incidents and outages to prevent recurrence.
  • Responsible for partnering with development teams to improve services via rigorous testing and release procedures.
  • Responsible for actively sharing knowledge and best practices across the organisation.
  • Responsible for identifying technical debt and partnering with application teams to build remediation plans.
  • Responsible for developing standard operational procedures and producing effective documentation.
  • Responsible for analysing workloads and devising suitable cloud migration strategies where appropriate.
  • Responsible for participating in on-call rotation, triaging and addressing production issues as they arise.
  • Responsible for performing the OS Platform Operations function as and when required.
  • Responsible for mentoring and developing less experienced SAs and SREs.
  • Responsible for identifying cost saving and optimisation opportunities within the customer business.
  • Responsible for building strong relationships across the customer functions and business areas, underpinned by trust and the core values of the customer.

Key Skills

Essential:

  • Excellent knowledge of VMware ESX and/or Nutanix HCI.
  • Excellent knowledge of Windows Server 2008/2012/2016/2019.
  • Excellent knowledge of Windows OS tuning utilities and commands.
  • Excellent knowledge of configuring Windows OS systems for optimal performance.
  • Excellent knowledge of Windows clustering and high-availability solutions.
  • Excellent knowledge of Microsoft Active Directory, LDAP and Kerberos.
  • Excellent knowledge of TCP/IP Networking Protocols.
  • Excellent knowledge of networking, storage, database and virtualisation layers.
  • Excellent knowledge of container orchestration platforms such as Docker and Kubernetes.
  • Excellent knowledge of version control software such as GitHub and Subversion.
  • Excellent knowledge of configuration management software such as Chef, Puppet, Ansible, Terraform and SaltStack.
  • Excellent knowledge of "Infrastructure as Code" principles and practices.
  • Excellent knowledge of continuous integration (CI) and continuous delivery (CD) principles and practices.
  • Excellent knowledge of application development using Agile and DevOps best practices.
  • Excellent knowledge of operating system security and auditing methods.
  • Excellent knowledge of security hardening principles in line with CIS industry benchmarks.
  • Excellent knowledge of data security governance and regulations such as GDPR and SOX.
  • Excellent knowledge of cloud computing - IaaS, PaaS and SaaS offerings across Azure, AWS, GCP and Oracle.

Desirable:

  • Good working knowledge of Red Hat Enterprise Linux (6.x, 7.x, 8.x) and Solaris (10.x and 11.x).
  • Good working knowledge of Unix/Linux OS tuning utilities and commands.
  • Good working knowledge of Unix/Linux system internals and kernel tuning for optimal performance.
  • Good working knowledge of Red Hat Satellite.
  • Good working knowledge of Anti-Virus software such as McAfee and Sophos.
  • Good working knowledge of Ivanti LANDESK and Symantec Altiris.
  • Good working knowledge of ThinPrint and Equitrac (Follow-Me Printing).
  • Good working knowledge of Rubrik.
  • Good working knowledge of EMC, HDS and Pure storage arrays.
  • Good working knowledge of Dell PowerEdge, IBM xSeries and Cisco UCS hardware.
  • Good working knowledge of EMC NetWorker, Data Domain and IBM Tivoli Storage Manager.
  • Good working knowledge of Infoblox DNS.
  • Good working knowledge of Icinga 2 and OpManager.
  • Good working knowledge of IBM Tivoli and Netcool.
  • Good working knowledge of GitHub, Subversion and TeamCity.
  • Good working knowledge of BMC Control-M.
  • Good working knowledge of CyberArk.
  • Good working knowledge of Splunk and IBM QRadar.
  • Good working knowledge of Qualys.
  • Good working knowledge of SharePoint, JIRA and Confluence.
  • Good working knowledge of ServiceNow and Serena Business Manager.

Candidate Specifications

  • Excellent communication and interpersonal skills
  • Ability to handle pressure during outages and systematically resolve issues
  • Excellent problem-solving skills
  • Results driven, with a strong sense of accountability
  • A proactive, motivated approach
  • The ability to operate with urgency and prioritise work accordingly
  • A structured and logical approach to work
  • Attention to detail and accuracy
  • Ability to perform well in a pressurised environment
  • Ability to manage constructive conflict effectively
  • The ability to manage large workloads and tight deadlines
  • Able to communicate complex technical concepts to non-technical persons at all levels

Reference: 1738078922
