Windows Site Reliability Engineer

Project Recruit

Posted on Sep 23, 2022 by Project Recruit

London, United Kingdom
Immediate Start
Annual Salary

Our client, a leading global supplier of IT services, requires a Windows Site Reliability Engineer based at their client's offices in London.

You may be able to work some days remotely.

This is a 1-year temporary contract to start ASAP.

Day rate: Competitive market rate

We are looking for a Windows Site Reliability Engineer with 10+ years' work experience and advanced working knowledge of Kubernetes and Docker container orchestration.

Key Responsibilities


  • Develop software to make infrastructure services self-managing and self-service
  • Deliver continuous service improvement by developing Infrastructure as Code
  • Eliminate manual, repetitive, automatable, tactical tasks that are devoid of value
  • Improve system performance, make effective use of resources, distribute load and reduce latency
  • Identify SLOs (Service Level Objectives) to meet availability and latency objectives
  • Develop proactive monitoring solutions that alert on symptoms and not just on outages
  • Perform detailed root cause analyses (RCAs) on incidents and outages to prevent recurrence
  • Partner with development teams to improve services via rigorous testing and release procedures
  • Identify technical debt and partner with application teams to build remediation plans
  • Develop standard operational procedures and produce effective documentation
  • Analyse workloads and devise suitable cloud migration strategies where appropriate
  • Ensure all project/investment workloads are delivered according to the defined plans and budget
  • Liaise with Infrastructure Control and IT Risk teams to satisfy internal and external audit requests
  • Deputise for the team lead when required and act up accordingly
  • Identify cost saving and optimisation opportunities across the group
  • Build strong working relationships across the organisation
  • Adhere to the core values of the customer


  • Perform daily health and compliance checks for all systems as required
  • Ensure all systems are backed up successfully and any issues are promptly resolved
  • Validate that monitoring alerts and batch job failures are detected promptly and resolved satisfactorily
  • Ensure sufficient capacity is available to accommodate drive growth
  • Respond to emails sent to the team distribution list/mailboxes in a timely manner
  • Handle incidents and requests with efficiency and a "customer first" mindset
  • Maintain infrastructure in a highly available, reliable, secure and performant manner
  • Carry out general server, database and virtualization administration and maintenance activities
  • Provide technical support to application support and development teams
  • Provide consultancy to application support and development teams
  • Take part in the on-call and weekend work rotation, triaging and addressing production issues as they arise

Key Skills

Highly Desirable:

  • Experience writing and managing plays/playbooks on AWX/Ansible Tower
  • Advanced working knowledge of Kubernetes and Docker container orchestration
  • Microsoft SQL Server, Oracle, Sybase ASE, MongoDB and Snowflake
  • IBM Tivoli/Netcool
  • Nutanix HCI and VMware ESX
  • Networking protocols (TCP/IP, DNS, DHCP, VLANs)
  • RHEL, Oracle Linux, Oracle Solaris and related technologies
  • Cloud computing - IaaS, PaaS and SaaS offerings across Azure, AWS, GCP and Oracle
  • Knowledge of data security governance and regulations such as GDPR and SOX


  • Dell EMC PowerStore (SAN) and Isilon (NAS)
  • Rubrik, EMC NetWorker, Data Domain and IBM Tivoli Storage Manager
  • CyberArk
  • Splunk
  • Qualys
  • Cisco Tetration
  • ServiceNow
  • JIRA and Confluence

Candidate Specifications

  • Excellent communication and interpersonal skills
  • Ability to handle pressure during outages and systematically resolve issues
  • Excellent problem-solving skills
  • Results driven, with a strong sense of accountability
  • A proactive, motivated approach
  • The ability to operate with urgency and prioritize work accordingly
  • A structured and logical approach to work
  • Attention to detail and accuracy
  • Ability to perform well in a pressurized environment
  • Ability to manage constructive conflict effectively
  • The ability to manage large workloads and tight deadlines
  • Able to communicate complex technical concepts to non-technical persons at all levels

Reference: 1738078812
