System Engineer (HPC) - #1133360
Fujitsu
Fujitsu is seeking an experienced HPC Systems Engineer (or Senior HPC Systems Engineer, depending on experience) to support and operate large-scale Linux-based high-performance computing (HPC), storage, and networking environments. This role supports research scientists, academic users, and enterprise workloads, ensuring reliable, secure, and high-performance HPC operations.
Key Responsibilities:
HPC Systems Operations: Administer, operate, and maintain Linux-based HPC clusters, including compute, storage, and high-speed networking
Manage and support:
HPC job schedulers (e.g. Slurm, PBS Pro, LSF)
Parallel file systems (Lustre, GPFS/Spectrum Scale, BeeGFS)
Data lake solutions (e.g. VAST)
Hierarchical Storage Management (HSM) solutions (e.g. Data Management Framework (DMF))
Cluster management and provisioning tools
Perform system monitoring, patching, upgrades, and capacity planning.
Troubleshoot and resolve hardware, software, OS, and network issues across HPC environments
Participate in on-call or escalation support rotations as needed
Work with our software engineer to support AI/DL applications, and with our desktop engineer to resolve user problems as required
Provide advice and guidance to researchers on HPC application development, debugging, optimization, and parallelization
Deliver HPC user training sessions and contribute to documentation and best-practice guides
Job Requirements:
Bachelor’s degree in Computer Science, Engineering, or a related field
At least 5 years’ experience with large-scale HPC systems preferred
Strong hands-on experience with:
Linux operating systems (RHEL, Rocky, SUSE)
HPC schedulers and resource managers
Parallel file systems
Proficiency in the management and administration of dHCI (disaggregated Hyper-Converged Infrastructure) platforms
Advanced scripting skills in Bash and Python; working knowledge of R for data analysis, with proficiency in other languages considered an asset.
Hands-on experience with OpenLDAP administration: user/group management, LDAP queries, security hardening, and performance tuning.
Understanding of HPC performance tuning and optimization techniques.
In-depth understanding of complex IT solutions, including system architecture, integration patterns, and scalability best practices.
Exposure to the following will be an added advantage:
HPC code optimization and parallelization
Languages and libraries: Fortran, C, C++, OpenMP, MPI
Knowledge of numerical simulation applications in areas such as climate research, weather forecasting, and aeronautics simulation
*Only shortlisted candidates will be notified.