Site Reliability Engineer - Technology
Posted on Nov 20, 2018 by Nike
Nike, Inc. Technology is responsible for making the world's largest sport brand run faster, smarter and more securely. From infrastructure to security and supply chain operations, Technology specialists drive growth through top-flight hardware, software and enterprise applications. Global Technology aggressively innovates solutions to drive growth while creating and implementing tools that help make everything else in the company possible.
As the Site Reliability Engineer, you will use your expertise in full stack application support to provide meaningful site health and performance metrics that continuously drive improvements in reliability and consumer experience. You will work across teams to proactively identify and troubleshoot incidents within the stack (both consumer-facing and non-consumer-facing). The application support team handles day-to-day operational activities in a dynamic, global 24x7 environment. Important to the role is the ability to identify, evaluate, and execute preventive measures that minimize or avoid impact to the consumer experience.
As a reliability engineer, you will enable and lead the critical incident and problem management process, engaging the appropriate colleagues, vendors, and leadership teams to restore service, manage root cause analysis, and recommend long-term fixes. You will also provide support during product launches and releases, prepare reports for management, identify risks and solutions, and make recommendations for continual improvement.
What we're looking for:
- Practical expertise in managing and leading application reliability practices for consumer-facing web and mobile experiences
- Ability to work across teams to continuously analyze system performance in production, troubleshoot consumer-reported issues, and proactively identify areas in need of optimization
- Previous experience developing and driving real-time monitoring solutions that provide visibility into site health and key performance indicators
- Working understanding of IT service management (Incident, Problem, Change and Knowledge management)
- Ability to lead a technical team of support engineers through day-to-day operations and critical incidents
- Prior experience with agile methodologies, performance engineering and automation tools
- Highly confident and capable in reporting and communicating high-value metrics to leadership, with a deep understanding of the business landscape and how site reliability influences our consumers
- Familiarity with most of the following: Java, ServiceNow, Splunk, New Relic, ScienceLogic, cloud computing, VMs, Windows, Linux and AWS
- Background with ITIL or Lean a plus
- 3-7 years' technical experience working with consumer-facing (e-commerce) software applications
- Bachelor's Degree in Computer Science, Engineering, IT or a related field; MBA a plus. Two additional years of experience will be accepted in lieu of a degree.