QMCPACK performance portability on NVIDIA and AMD GPUs

ORAL

Abstract

As exascale supercomputers are being deployed in the U.S., QMCPACK (https://qmcpack.org) developers have migrated the code base to a performance-portable implementation that supports production science on these powerful machines with an out-of-the-box experience. With a fresh design of the code architecture, the historically divergent code paths for CPUs and GPUs have been unified, and a core set of features is now available on all computing platforms, including both CPUs and GPUs. With the portable OpenMP target offload programming model and high-quality vendor linear algebra libraries, impressive performance has been achieved with minimal vendor-specific customization. We show current performance for materials calculations on NVIDIA and AMD GPUs across a broad range of electron counts and analyze the remaining inefficiencies.

*This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.

Presenters

  • Ye Luo

    • Argonne National Laboratory

Authors

  • Ye Luo

    • Argonne National Laboratory
  • Peter Doak

    • Oak Ridge National Laboratory
  • Paul Kent

    • Oak Ridge National Laboratory