GPU-Accelerated Large-Scale Electronic Structure Theory on Titan with a First-Principles All-Electron Code

ORAL

Abstract

Density-functional theory is well established as the dominant quantum-mechanical computational method in the materials community. Large, accurate simulations are very challenging on small to mid-scale computers and require high-performance computing platforms to succeed. GPU acceleration is one promising approach. In this talk, we present a first implementation of all-electron density-functional theory in the FHI-aims code for massively parallel GPU-based platforms. Special attention is paid to the update of the density and to the integration of the Hamiltonian and overlap matrices, both realized in a domain decomposition scheme on non-uniform grids. The initial implementation scales well across nodes on ORNL's Titan Cray XK7 supercomputer (8 to 64 nodes, 16 MPI ranks per node). Using the Tesla K20X GPUs on the Titan nodes yields an overall runtime speedup of 1.4x, with the charge density update showing a speedup of 2x. Further acceleration opportunities will be discussed.
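As a rough illustration of what integrating an overlap matrix in a domain decomposition scheme on a non-uniform grid can look like, the following Python sketch may help. It is a one-dimensional toy, not the FHI-aims implementation: the Gaussian basis functions, the cubic grid mapping, the trapezoid weights, and all function names (`gaussian`, `make_grid`, `integrate_batch`, `overlap_matrix`) are assumptions made for this example. Each batch of grid points stands in for the work one MPI rank or GPU would handle, and the partial matrices are summed at the end.

```python
import math

def gaussian(center, alpha=1.0):
    """Normalized 1D Gaussian, a hypothetical stand-in for a real basis function."""
    norm = (2.0 * alpha / math.pi) ** 0.25
    return lambda x: norm * math.exp(-alpha * (x - center) ** 2)

def make_grid(n=2001, x_max=8.0):
    """Non-uniform grid: dense near the origin, sparse far away (assumed mapping)."""
    ts = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    xs = [x_max * t ** 3 for t in ts]
    ws = []
    for k in range(n):
        # Trapezoid-rule weight for unevenly spaced points.
        left = xs[k - 1] if k > 0 else xs[k]
        right = xs[k + 1] if k < n - 1 else xs[k]
        ws.append(0.5 * (right - left))
    return xs, ws

def integrate_batch(xs, ws, funcs):
    """Partial overlap matrix from one batch of grid points."""
    m = len(funcs)
    part = [[0.0] * m for _ in range(m)]
    for x, w in zip(xs, ws):
        vals = [f(x) for f in funcs]
        for i in range(m):
            for j in range(m):
                part[i][j] += w * vals[i] * vals[j]
    return part

def overlap_matrix(xs, ws, funcs, n_batches):
    """Split the grid into contiguous batches and sum the partial matrices."""
    m = len(funcs)
    total = [[0.0] * m for _ in range(m)]
    size = -(-len(xs) // n_batches)  # ceiling division
    for start in range(0, len(xs), size):
        part = integrate_batch(xs[start:start + size], ws[start:start + size], funcs)
        for i in range(m):
            for j in range(m):
                total[i][j] += part[i][j]  # plays the role of a reduction across ranks
    return total

xs, ws = make_grid()
funcs = [gaussian(0.0), gaussian(1.0)]
S = overlap_matrix(xs, ws, funcs, n_batches=16)
print(S)
```

Because each batch touches only its own grid points, the batches are independent until the final sum, which is the property that makes this kind of integration a natural candidate for distribution across ranks and offloading to accelerators.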

*Work supported by the LDRD Program of ORNL, managed by UT-Battelle, LLC, for the U.S. DOE, and by the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

Authors

  • William Paul Huhn

    • MEMS Department, Duke University
    • Duke University
  • Björn Lange

    • MEMS Department, Duke University
  • Victor Yu

    • MEMS Department, Duke University
  • Volker Blum

    • MEMS Department, Duke University
    • Duke University
  • Seyong Lee

    • Computer Science and Math Division, Oak Ridge National Laboratory
  • Mina Yoon

    • Oak Ridge National Laboratory
    • Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
    • Center for Nanophase Materials Sciences, Oak Ridge National Laboratory