Application Experiences on a GPU-Accelerated Arm-based HPC Testbed


Elwasif, W.; Godoy, W.; Hagerty, N.; Harris, J. A.; Hernandez, O.; Joo, B.; Kent, P.; Lebrun-Grandie, D.; Maccarthy, E.; Melesse Vergara, V. G.; Messer, B.; Miller, R.; Oral, S.; Bastrakov, S.; Bussmann, M.; Debus, A.; Steiniger, K.; Stephan, J.; Widera, R.; Bryngelson, S. H.; Le Berre, H.; Radhakrishnan, A.; Young, J.; Chandrasekaran, S.; Ciorba, F.; Simsek, O.; Clark, K.; Spiga, F.; Hammond, J.; Stone, J. E.; Hardy, D.; Keller, S.; Piccinali, J.-G.; Trott, C.

Abstract

This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems built by GIGABYTE, each equipped with a server-class Arm CPU from Ampere Computing and an A100 data center GPU from NVIDIA Corp. The systems are connected by a high-bandwidth, low-latency InfiniBand interconnect. The selected applications and mini-apps are written in several programming languages and use multiple accelerator-based programming models for GPUs, such as CUDA, OpenACC, and OpenMP offloading. Application porting requires a robust and easy-to-access programming environment, including a variety of compilers and optimized scientific libraries. The goal of this work is to evaluate platform readiness and assess the effort required from developers to deploy well-established scientific workloads on current- and future-generation Arm-based GPU-accelerated HPC systems. The reported case studies demonstrate that the current level of maturity and diversity of software and tools is already adequate for large-scale production deployments.

Keywords: ARM; HPC; NVIDIA; GPU; CUDA; OpenACC; OpenMP; alpaka; PIConGPU

Permalink: https://www.hzdr.de/publications/Publ-36063