Site Reliability Engineer

Posted on Nov 13, 2018 by Nike

Portland, OR 97201
Leisure & Sport
Immediate Start
Annual Salary
Full-Time
Job Description

Nike, Inc. Technology is responsible for making the world's largest sport brand run faster, smarter and more securely. From infrastructure to security and supply chain operations, Technology specialists drive growth through top-flight hardware, software and enterprise applications. Global Technology aggressively innovates solutions to drive growth while creating and implementing tools that help make everything else in the company possible.

Description

As our Site Reliability Engineer, you are part coder, part sleuth, part efficiency expert, and you're passionate about getting code into production that brings efficiency and quality gains to Nike Experiences. We are a maturing Service Reliability Engineering team, and you will have the opportunity to work across all aspects of the Nike business alongside engineers, production support, and product owners. Your focus is to ensure that their systems are reliable and monitored and that support operations are automated. You will also collaborate with stakeholders to serve, observe, own, and solve problems through innovation, reducing friction in production deployments and increasing availability. By working closely with the Tools and Release teams, you will make sure changes are continuously built, tested, and deployed. This role fills the large gap between Application Engineers and Tech Ops staff and requires you to be an excellent communicator, listener, and explainer.
What we're looking for:
  • Track record of succinctly documenting processes, procedures, and best practices
  • Integration expert: able to wire systems together via their APIs
  • Fluent coder in Python, Java, Bash, C#, PowerShell or similar
  • Comfortable with Linux and Windows
  • Experience working on complex 24x7 available distributed systems
  • Familiarity with build tools, particularly Jenkins
  • Understanding and experience with Build Repositories, ideally Artifactory
  • Deep knowledge of developer workflows with Git (Bitbucket)
  • Experience setting up monitoring using ScienceLogic, New Relic or SolarWinds
  • Experience transforming Big Data into Operations Insights, ideally with Splunk
  • Comfortable leveraging AI and Machine Learning for Predictive Analysis of Failures and Correlation, ideally with Splunk
  • Experience migrating systems between technologies
  • Participation in on-call rotation
  • Runtime Infrastructure: Docker, Kubernetes, Lambdas
  • Storage Systems: DynamoDB, MongoDB, Redis, Hadoop, Memcached
  • Messaging: Kafka, Rsyslog, Logstash, Splunk
  • Programming/Scripting: Java / Jetty, Python, Scala, PowerShell, C#, Bash
  • Build Tools/Repositories: Jenkins, Artifactory
  • Web Services Framework: Django, Flask
  • API Framework: Gunicorn


Qualifications

  • 5+ years of relevant experience focused on site reliability, TechOps, DevOps, systems administration, application development, build, release and deployment
  • Experience deploying systems with Kubernetes, Docker, or Azure Containers
  • Automation experience with ScienceLogic or Ansible
  • Some experience with monitoring and anomaly detection systems
  • Hands-on experience with monitoring tools such as New Relic, Splunk, SignalFx, SolarWinds, etc.
  • Well versed in ITIL concepts (Event, Incident, Knowledge, Change, Problem Management)
  • Familiarity with Chatbots such as Slackbot
  • End-user knowledge of ITSM tools such as ServiceNow


Reference: 561333893
