Site Reliability Engineer

CV-Library

Posted on Feb 3, 2025 by CV-Library
Birmingham, West Midlands (County), United Kingdom
IT
Immediate Start
£65k - £70k Annual
Full-Time
Site Reliability Engineer

Permanent

£70,000

Midlands-based/Hybrid working

As the Site Reliability Engineer, you will join the client's Platform Engineering Team to help build, manage, and support some of the client's core infrastructure.

Key areas of responsibility:

Ensuring the platform services meet high standards for availability, reliability, and performance
Defining and promoting best practices for observability, incident management, and operational processes
Leading incident management efforts
Partnering with platform engineers and product teams
Developing and maintaining monitoring, logging, and alerting solutions to provide actionable insights into platform health and performance

Key Skills

You will have a deep understanding of concepts such as SLAs, SLOs, and error budgets
You will have expertise in tools such as Prometheus, Grafana, Loki, or similar
You will have experience in leading incident response processes, including root cause analysis and implementing preventative measures
You will be proficient in scripting languages (e.g., Python, Bash)
You will need to work effectively with cross-functional teams
You will be a problem solver

Reference: 222985236

https://jobs.careeraddict.com/post/99507398

This Job Vacancy has Expired!
