Head of Site Reliability Engineering
Posted on Nov 21, 2018 by STATS
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. The SRE team is responsible for the availability, performance, monitoring, and incident response of STATS' critical internal and customer-facing systems.
What You'll Do:
Lead a team of site reliability engineers to:
- Engage in and improve the whole life cycle of services, from inception and design through deployment, operation and refinement
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health
- Scale systems through sustainable mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
- Practice sustainable incident response and blameless postmortems
- Establish a mindset and a set of engineering approaches for running better production systems, with a focus on optimizing existing systems, building infrastructure and eliminating work through automation
- Establish a culture of diversity, intellectual curiosity, problem solving and openness to ensure team success
- Create an environment that provides the support and mentorship needed to learn and grow
Skills & Requirements
What You'll Need:
- Experience with large-scale cloud implementations and cloud-native application architecture
- Experience building and leading software engineering teams
- Experience managing cloud-based application and platform production environments and processes
- BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
- Experience with algorithms, data structures, complexity analysis and software design
- Experience in one or more of the following: C, C++, Java, Python, Go, Perl or Ruby
Who We Are:
The values at STATS mirror those of foundationally great teams and franchises. Our objective is to win through effort, creativity, teamwork and positive energy. Specifically, we look for candidates who embody the following: Be All-In, Put the Fan at the Center, Get Stuff Done, It's Your Team, Make an Impact, and Fearless Integrity. We want employees who crave responsibility and accountability and want to have fun working in a collaborative, Get Stuff Done environment.
- Put the Fan at the Center: You enjoy working with customers and have the strong communication skills to make those interactions a success.
- It's Your Team: You embrace others' ideas (even if they conflict with your own) for the sake of the company and customer. You are a collaborator and relationship builder.
- Get Stuff Done: You are driven and your can-do attitude inspires others to elevate their performance in a fast-moving environment.
- Make an Impact: You thrive in a fast-paced, changing environment and you're excited by the chance to play a large role.
- Be All-In: You must be passionate about what you do and about our customers' success.
- Fearless Integrity: You are self-motivated and capable of holding both yourself and others accountable to deliver on multiple tasks.
STATS provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, sexual orientation, national origin, age, disability or genetics.