Site Reliability Engineering (Must Have Active SC)
Posted on Sep 22, 2022 by J & C Associates Ltd
We are IT Recruitment Specialists partnered with a prestigious Global Consultancy that requires a Site Reliability Engineer (must have active SC) for one of its public sector clients, based in Blackpool, Manchester, Leeds, Sheffield or Newcastle.
- Role: Site Reliability Engineer
- Contract Length: 6 months
- Location: Hybrid, 40% in the office (Blackpool, Manchester, Leeds, Sheffield or Newcastle)
- IR35: Inside
- Security Clearance: SC
Summary of the Role and key responsibilities
- As a Senior Site Reliability Engineer you will drive adoption of SRE best practice across our cloud estate. Using both your soft skills and technical experience, you will work with teams to ensure our standards and governance are met by onboarding our services into the cloud through a dedicated assessment stage gate process, so that our citizen-facing applications satisfy all the operational and security requirements for running in production. You will execute deployments using runbooks, investigate production incidents and provide dedicated support to teams to determine the root cause. You will provide an on-call service to help restore services, using dedicated runbooks or your technical experience. You will help to reduce toil and increase automation by developing reliability improvements that reduce the time to live and the cost of repetitive tasks. You will provide guidance to our development teams and influence best practice.
- Responsible for contributing authoritative advice and guidance to others in the organisation and externally.
You will develop your technical skills and must have the following knowledge:
- Essential - Terraform, Ansible, Python, Bash, GitLab CI/CD, AWS or Azure managed services, Monitoring services (CloudWatch/Prometheus/Azure Monitor), Containers
- Desirable - Kubernetes, System administration (RHEL, Windows Server, etc.), Network configuration (DNS, routing, load balancing, etc.)
- Design and develop techniques for improving application reliability, runbooks, knowledge transfer to the UXCC, and ongoing SRE strategy within your Functional and Professional Communities
- Act as the focal point for the investigation and resolution of major or complex incidents for the service, ensuring people with the right skills and expertise are proactively available to respond effectively
- Assess the impact of change requests in consultation with stakeholders, providing technical expertise and authorising the implementation of subsequent changes
- Be on-call for applications that require out-of-hours SRE coverage
- Undertake comprehensive analysis of performance trends to identify root causes, progressing opportunities to improve the reliability, security and capability of infrastructure, application and site services
- Actively engage with senior stakeholders and provide clear communication of incident resolution and service improvements.
- Assure critical changes to the applications and supporting infrastructure
- Develop and maintain relevant knowledge such that it can be easily annotated, updated, referenced, and consumed
- Conduct code assessments, with a view to correcting errors and providing recommendations for reliability improvements
- Manage the team backlog for the applications for which you are accountable
- Coach and mentor application development and operations engineers in the practice and techniques of SRE
- Conduct reflectives for all high priority and major incidents, ensuring they are completed quickly and published
- Routinely seek views and capture ideas from stakeholders and team members for improvements and encourage collaboration and innovation
- Take part in interdepartmental discussions and meetings with a wide variety of external bodies and organisations on a local, regional, national or international basis, leading community discussions about SRE best practice within Engineering.
If you are interested in this position and would like to learn more, please send through your CV and we will get in touch with you as soon as possible. Please note, candidates are often shortlisted within 48 hours.