Senior HPC Infrastructure Engineer

Posted on Sep 25, 2024 by Guardant Health
Palo Alto, CA
Engineering
Immediate Start
Annual Salary
Full-Time
Job Description

Guardant’s HPC team builds and operates the computational technology backbone of the company. 

This includes scalable data storage that holds petabytes of genomics data, high-performance compute clusters running a custom bioinformatics pipeline in production and R&D environments, and the software infrastructure that hosts an ecosystem of services for internal data processing and external data integration. To support Guardant Health’s fast growth over the next few years, the HPC team is looking for a strong technical engineer who can help maintain and grow the HPC infrastructure during its aggressive expansion while working with the corporate IT, SQA, and DevOps/SRE teams.

This role can be worked remotely part of the time, but at a minimum it requires a very hands-on, on-premises presence when on rotation.

In this role, you will primarily:

Assist in managing the HPC interconnect

Assist in integrating the HPC systems with the bandwidth on-demand system

Work with the networking infrastructure team to manage and optimize the connectivity to and from the HPC systems and locales

Help manage multiple HPC clusters and cluster file systems. 

Help research, develop, and implement the next-generation HPC solution

Troubleshoot the production system stack down to the source code level (e.g., shell scripts, Python, and others).

Maintain, monitor, and support the infrastructure environment and/or facilities.

Use and maintain enhanced production monitoring and additional capabilities.

Support improvements for increased system reliability and performance.

Support multiple systems or applications of medium to high complexity (complexity defined by size, technology used, and system feeds and interfaces) with multiple concurrent users, ensuring control, integrity, and accessibility.

Support systems at remote locations, including internationally

Work with offsite consultants to maintain the infrastructure

Work with vendors to troubleshoot, upgrade and repair systems as needed

Participate in a 24/7 on-call rotation

Reference: 202389598

https://jobs.careeraddict.com/post/95561967

