Site Reliability Engineer
Posted on Jun 11, 2026 by CV-Library
City of London, City and County of the City of London, United Kingdom
Accountancy
Immediate Start
£90k Annual
Full-Time
Site Reliability Engineer (Cloud & Automation) - London - 2 Days on Site per week.
A leading global financial services organisation is seeking a Site Reliability Engineer (SRE) to drive reliability, automation, and performance across its cloud-hosted platforms.
The Opportunity
This role sits within a high-performing Platform Operations function, acting as a central point of expertise for SRE methodologies and automation. You will play a key role in improving system resilience, scalability, and operational excellence across a complex, regulated environment.
Key Responsibilities
Lead the implementation of SRE best practices across cloud infrastructure
Drive improvements in observability, alerting, and capacity planning (SLA / SLO / SLI)
Identify and reduce operational toil through automation and remediation frameworks
Build and enhance GitOps and Infrastructure-as-Code capabilities (e.g. Terraform, Ansible)
Develop and review production-grade code to support automation initiatives
Support incident management and on-call processes, ensuring production stability
Contribute to post-incident reviews, embedding SRE principles to reduce risk
Requirements
Demonstrable experience in SRE or infrastructure operations within cloud environments (AWS / GCP)
Strong scripting skills (Python, Ansible, or PowerShell)
Experience with Infrastructure as Code and GitOps methodologies
Hands-on knowledge of observability / APM tools (e.g. Grafana, Datadog, Dynatrace)
Proven experience managing incidents, root cause analysis, and on-call support
Understanding of SLA/SLO/SLI frameworks and reliability engineering principles
Desirable
Background in software development
Experience working within regulated financial services environments
Familiarity with ITIL and enterprise service management frameworks
Relevant certifications (e.g. AWS, Terraform)
Why Apply
Opportunity to shape cloud reliability strategy in a large-scale environment
Work with modern tooling across automation, DevOps, and SRE practices
Strong emphasis on engineering excellence and continuous improvement
Competitive compensation and long-term career progression
To find out more about Huxley, please visit
Huxley, a trading division of SThree Partnership LLP, is acting as an Employment Business in relation to this vacancy | Registered office | 8 Bishopsgate, London, EC2N 4BQ, United Kingdom | Partnership Number | OC(phone number removed) England and Wales
Reference: 225234414
https://jobs.careeraddict.com/post/113393136