This Job Vacancy has Expired!

Site Reliability Engineer

Posted on Feb 4, 2021 by Request Technology

California, CA
IT
Immediate Start
$120k - $150k Annual
Full-Time


A prestigious company is searching for a Software Engineer to fill a Site Reliability Engineer role. The ideal candidate is an expert with extensive experience in Linux administration, Puppet, Docker, and Python scripting. They will need extensive knowledge of infrastructure and application monitoring tools and must be able to implement IaC concepts using Terraform, Chef, and Puppet. This is a full-time position and can be worked remotely.

Responsibilities:

  • Implement tools and processes necessary to achieve required SLOs for Company Platform.
  • Define and implement CI/CD pipelines.
  • Automate delivery of platform services using infrastructure as code. Build self-service playbooks for the platform that can be consumed by globally distributed teams at Company.
  • Define and implement the incident response management process and deploy the necessary tools.
  • Fix support and escalation issues.
  • Conduct post-incident reviews.
  • Collaborate with application and business stakeholders to ensure a high-quality product is developed and deployed in production. Work diligently with other engineering teams to ratify the release processes necessary to meet business goals.
  • Drive the continuous improvement process.

Skills:

  • Expert knowledge of one of the major public cloud platforms (Azure, AWS, Google Cloud Platform)
  • Hands-on programming experience in Python or other object-oriented programming languages.
  • Expert knowledge of infrastructure and application monitoring tools: Prometheus, Grafana, DataDog, etc.
  • Experience implementing IaC concepts using Terraform, Chef, Puppet.
  • Experience with Elasticsearch, Kibana
  • Experience administering Databases
  • Expert in Linux administration.
  • Expert knowledge of Docker, Helm.
  • Experience implementing CI/CD for cloud native applications.
  • Experience with deploying applications that utilize Service Mesh
  • Experience administering Kubernetes clusters.
  • Experience defining and implementing incident response management processes.

Qualifications:

  • Bachelor's degree
  • 8+ years' experience in software engineering
  • Master's degree - preferred
  • Understanding of GitOps principles.
  • Experience implementing secure and compliant Kubernetes platforms.
  • Experience deploying and managing stateful distributed services in Kubernetes.
  • Experience with security scanning tools.
  • Experience with intrusion detection systems.
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Working knowledge of Databricks, Team Foundation Server, TeamCity, Octopus Deploy, and DataDog



Reference: 1082056313
