Posted 29 Oct 2023, 4:00 pm
HPC Principal Engineer at Chan Zuckerberg Biohub - San Francisco
The Chan Zuckerberg Biohub has an immediate opening for a High Performance Computing (HPC) Principal Engineer. The CZ Biohub is a one-of-a-kind independent non-profit research institute that brings together three leading universities - Stanford, UC Berkeley, and UC San Francisco - into a single collaborative technology and discovery engine. Along with the world-class engineering team at the Chan Zuckerberg Initiative, the CZ Biohub supports over 100 of the brightest, boldest engineers, data scientists, and biomedical researchers in the Bay Area, with the mission of understanding the underlying mechanisms of disease through the development of tools and technologies and the application to therapeutics and diagnostics.
This position will be tasked with strengthening and expanding the scientific computational capacity to further the Biohub’s expanding global scientific leadership. The HPC Principal Engineer will also provide IT capabilities and consulting support to science and technical programs. This position will work closely with many different science teams simultaneously to translate experimental descriptions into software and hardware requirements across all phases of the scientific lifecycle, including data ingest, analysis, management and storage, computation, authentication, tool development and many other IT needs expressed by scientific projects.
This position reports to the Director for Scientific Computing and will be hired at a level commensurate with the skills, knowledge, and abilities of the successful candidate.
What You'll Do
- Work with a wide community of scientific disciplinary experts to identify emerging and essential information technology needs and translate those needs into information technology requirements
- Build an on-prem HPC infrastructure supplemented with cloud computing to support the expanding IT needs of the Biohub
- Support the efficiency and effectiveness of capabilities for data ingest, data analysis, data management, data storage, computation, identity management, and many other IT needs expressed by scientific projects
- Plan, organize, track and execute projects
- Foster cross-domain community and knowledge-sharing between science teams with similar IT challenges
- Research, evaluate and implement new technologies on a wide range of scientific compute, storage, networking, and data analytics capabilities
- Promote the use of cloud compute services (primarily AWS and GCP), containerization tools, etc. among scientific clients and research groups, and assist researchers in using them
- Work on problems of diverse scope where analysis of data requires evaluation of identifiable factors
- Assist in cost & schedule estimation for the IT needs of scientists, as part of supporting architecture development and scientific program execution
- Support Machine Learning capability growth at the CZ Biohub
- Provide scientist support in deployment and maintenance of developed tools
- Plan and execute all above responsibilities independently with minimal intervention
What You'll Bring
- Bachelor’s Degree in Biology or Life Sciences is preferred. Degrees in Computer Science, Mathematics, Systems Engineering or a related field, or equivalent training/experience, are also acceptable. An advanced degree is strongly desired.
- A minimum of 8 years of experience designing and building web-based working projects using modern languages, tools, and frameworks
- Experience building on-prem HPC infrastructure and capacity planning
- Experience and expertise working on complex issues where analysis of situations or data requires an in-depth evaluation of variable factors
- Experience supporting scientific facilities, and prior knowledge of scientific user needs, program management, data management planning or lab-bench IT needs
- Experience with HPC and cloud computing environments
- Ability to interact with a variety of technical and scientific personnel with varied academic backgrounds
- Strong written and verbal communication skills to present and disseminate scientific software developments at group meetings
- Demonstrated ability to reason clearly about load, latency, bandwidth, performance, reliability, and cost and make sound engineering decisions balancing them
- Demonstrated ability to quickly and creatively implement novel solutions and ideas
Technical experience includes:
- Proven ability to analyze, troubleshoot, and resolve complex problems that arise in HPC production storage hardware, software systems, and storage networks
- Configuring and administering parallel and network-attached storage (Lustre, NFS, ESS, Ceph) and storage subsystems (e.g. IBM, NetApp, DataDirect Networks, LSI)
- Installing, configuring, and maintaining job management tools (such as SLURM, Moab, TORQUE, PBS, etc.)
- Red Hat Enterprise Linux, CentOS, or derivatives and Linux services and technologies like dnsmasq, systemd, LDAP, PAM, sssd, OpenSSH, cgroups
- Scripting languages (including Bash, Python, or Perl)
- Virtualization (ESXi or KVM/libvirt), containerization (Docker or Singularity), configuration management and automation (tools like xCAT, Puppet, kickstart) and orchestration (Kubernetes, docker-compose, CloudFormation, Terraform)
- High performance networking technologies (Ethernet and InfiniBand) and hardware (Mellanox and Juniper)
- Configuring, installing, tuning and maintaining scientific application software
- Familiarity with source control tools (Git or SVN)
The Chan Zuckerberg Biohub requires all employees, contractors, and interns, regardless of work location or type of role, to provide proof of full COVID-19 vaccination, including a booster vaccine dose, if eligible, by their start date. Those who are unable to get vaccinated or obtain a booster dose because of a disability, or who choose not to be vaccinated due to a sincerely held religious belief, practice, or observance must have an approved exception prior to their start date.
- Principal Engineer = $212,000 - $291,500
New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. To determine starting pay, we consider multiple job-related factors including a candidate’s skills, education and experience, market demand, business needs, and internal parity. We may also adjust this range in the future based on market data. Your recruiter can share more about the specific pay range during the hiring process.
Source: Remote Ok