Lead High Performance Computing Engineer

Brown University

Providence, RI

ID: 7291210
Application Deadline: Open Until Filled

Job Description

Office of Information Technology

The Lead High Performance Computing Engineer is responsible for the HPC team that manages high-performance computing (HPC) cluster, storage, and networking infrastructure. Responsibilities include, but are not limited to, integrating HPC systems with Brown’s overarching IT infrastructure, deploying and managing parallel file systems, and maintaining system security. The Lead HPC Engineer is expected to occasionally debug and possibly rewrite low-level systems software, evaluate software for potential acquisition, and work closely with vendors’ support organizations to ensure timely resolution of problems and high availability of Center for Computation and Visualization (CCV) production services. The role also includes occasional user support for system-level application debugging and optimization.

Qualifications:

Education

Bachelor’s degree preferred

Required Experience

3 to 5 years of experience with Linux systems administration
Lead or supervisory experience
Familiarity and experience with many of the following:

Research computing environments

RHEL/CentOS Linux operating system management experience

Problem resolution and troubleshooting skills (e.g., reading dumps and traces)

Excellent interpersonal and communication skills (both oral and written)

Strong ability to multi-task and prioritize activities

Conceptual knowledge of all, and in-depth knowledge of some, of the following: systems architectures, security, networking, storage systems, parallel computing, and batch/scheduling systems

Knowledge of programming languages such as C, C++, Bash, and Perl

Use of source control systems such as Git

Use of log correlation software such as Sumo Logic

Knowledge of large-scale research computing platforms and tools such as Globus, HPC environments, SLURM, and GPFS

Machine learning frameworks, including TensorFlow

Demonstrates a willingness and ability to support a diverse and inclusive environment.

Preferred Experience

GPFS, Lustre, or BeeGFS filesystem experience

Basic knowledge of MPI programming and RDMA interconnects

InfiniBand

SLURM