Lead High Performance Computing Engineer
Brown University
Providence, RI
ID: 7291210
Posted: 1 month ago
Application Deadline: Open Until Filled
Job Description
Lead High Performance Computing Engineer
Office of Information Technology
The Lead High Performance Computing Engineer leads the HPC team that manages the high-performance computing (HPC) cluster, storage, and networking infrastructure. This includes, but is not limited to, integrating HPC systems with Brown’s overarching IT infrastructure, deploying and managing parallel file systems, and maintaining system security. The Lead HPC Engineer is expected to occasionally debug and possibly rewrite low-level systems software, evaluate software for potential acquisition, and work closely with vendors’ support organizations to ensure timely resolution of problems and high availability of CCV production services. In addition, the Lead HPC Engineer occasionally provides user support to assist with system-level application debugging and optimization.
Qualifications:
Education
Bachelor’s degree preferred
Required Experience
3-5 years of experience with Linux systems administration
Lead or supervisory experience
Familiarity and experience with many of the following:
Research computing environments
RHEL/CentOS Linux operating system management experience
Problem resolution and troubleshooting skills, e.g., dump reading and traces
Excellent interpersonal and communication skills (both oral and written)
Strong ability to multi-task and prioritize activities
Conceptual knowledge of all and in-depth knowledge of some of the following: systems architectures, security, networking, storage systems, parallel computing, batch/scheduling systems
Knowledge of programming languages such as C, C++, Bash, and Perl
Use of source control systems such as Git
Use of log correlation software such as Sumo Logic
Knowledge of large-scale research computing platforms such as Globus, HPC environments, SLURM, GPFS
Machine learning frameworks, including TensorFlow
Demonstrates a willingness and ability to support a diverse and inclusive environment.
Preferred Experience
GPFS, Lustre, or BeeGFS filesystem experience
Basic knowledge of MPI programming and RDMA interconnects
InfiniBand
SLURM