Dear All,
I would like to share with you that we have an opening for a System
Administrator at the National Heart, Lung, and Blood Institute, NIH
(USA). Please forward this to anyone who may be interested!
The Laboratory of Computational Biophysics (LCB) is a group of
researchers who employ computational simulation methods to investigate
problems in biophysics and chemistry using the Linux-based LoBoS
high-performance computing (HPC) cluster within the National Heart,
Lung, and Blood Institute (NHLBI) at the National Institutes of Health
(https://www.lobos.nih.gov/LoBoS.shtml). LoBoS consists of several
hundred CPU/GPU computational nodes, three tiers of storage (home
directories, scratch space, and archive), associated network
infrastructure (both InfiniBand and Ethernet), and Linux desktops for
users.
This position is for the day-to-day management of the LoBoS HPC
compute nodes, storage systems, and desktops. The position involves
working as part of a small team (at least two people) whose primary
responsibilities are to keep the cluster running in good order and to
ensure that the cluster follows security best practices as determined
by the NIH and the Department of Health and Human Services. It also involves
maintaining the usability of the LoBoS cluster via yearly purchase and
installation of hardware to replace aging components.
ABOUT THE POSITION
• Keep the various components of the LoBoS cluster in good working
order, including network configuration, firewall management (Palo
Alto), file system management (ZFS, VAST), security, batch queuing
systems (SLURM), database administration, distributed computing, file
transfer services, web servers, and electronic mailing lists.
• May occasionally require work outside normal 9-5 hours to address
emergency situations with the cluster (e.g. significant numbers of
down nodes or storage outages) or cybersecurity incidents (FISMA).
• Ensure that the LoBoS cluster has sufficient capabilities to run the
scientific software needed by the LCB scientists. Evaluate the
existing system to determine when updates/upgrades to hardware and/or
software are necessary. Manage the budget used to
procure new hardware/software for LoBoS. Oversee configuration and
installation of virtual and physical servers and manage upgrades to
existing hardware.
• Ensure that patches, security updates, and configuration changes to
software systems are applied to enhance reliability and to meet
security needs. Collaborate with OCIO, CIT, and NHLBI security teams
to ensure adherence to compliance policies.
• Assist in maintaining the LoBoS Assessment & Authorization package
based on National Institute of Standards and Technology SP 800-53
security controls under guidance from NHLBI's Information System
Security Officers.
• Serve as a technical resource for HPC, LCB, NHLBI, and other NIH
personnel in areas such as the Linux operating system, networking,
database system administration, and distributed computing. May serve on
technical evaluation panels for institute-wide initiatives.
• Technology tracking: Stay informed regarding new developments in
hardware/software, and evaluate their potential usability for
LoBoS/LCB. Participate in conferences and meetings of professional
groups concerned with the application of HPC, AI/machine learning, and
other emerging computer technologies.
• Prepare software documentation and technical reports related to
assigned projects.
ABOUT YOUR BACKGROUND
• 5+ years of experience in Linux HPC systems administration is
preferred. However, less experienced candidates with outstanding
qualifications will also be considered.
• Comprehensive knowledge of shell scripting. Broad knowledge of
systems administration tools (e.g. Puppet, Ansible) along with a
detailed knowledge of tools used in a particular area such as file
system management, usage accounting, mail configuration, database
system administration, file transfer, or security.
• Experience with government computer security rules and standards is desirable.
• Extensive knowledge of at least two high-level computer languages
such as C, C++, FORTRAN, Ruby, Perl, or Python is desirable.
• Experience implementing and managing SLURM batch queuing software is preferred.
• Solid interpersonal, leadership, and critical thinking skills.
• Excellent written and oral communication skills.
ADDITIONAL INFORMATION
• Location: 9000 Rockville Pike, Bethesda, Maryland, which is
accessible via bus/bicycle/Metro (Red Line: Medical Center).
• Some travel to professional meetings (e.g. the Supercomputing
Conference) may occasionally be required.
• Some remote work is acceptable (up to 3 days per week).
• Employment type: full-time government contractor.
• Salary range: $100,000 to $180,000 per year, commensurate with
education and experience.
• A selection of health and wellness benefits will be offered.
HOW TO APPLY
To be considered, please submit your resume and cover letter to Dr.
Daniel R. Roe at daniel.roe.nih.gov with the subject line "System
Administrator". Appointees must be U.S. citizens or Permanent Resident
Card holders. Applications should be submitted by November 4, 2024.
We are an equal opportunity employer, and we actively prohibit
discrimination and harassment of any kind. We strongly encourage
people of color, LGBTQ+ people, immigrants, women, and people who are
differently-abled to apply.
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Oct 10 2024 - 06:30:02 PDT