Particle in Cell Algorithms and Codes Toward the Next Generation Architectures
ORAL
Abstract
Massively parallel, heterogeneous next-generation platforms present unprecedented challenges for maximizing the efficiency of plasma simulation kernels. Asynchronous many-task (AMT) frameworks use deferred execution and asynchronous tasking to enable runtime capabilities ranging from load balancing to data reuse to communication overlap. Many AMT systems are large, full-stack frameworks that raise programmability or stability concerns, which limits their use in Sandia production codes. We have developed DARMA (Distributed Asynchronous Resilient Models for Application), which instead provides a lightweight translation layer for embedding tasking and deferred execution in C++ codes and which encourages, rather than restricts, flexible and diverse AMT software stacks. In this work, we focus on leveraging DARMA to better express the asynchrony and load imbalance present in particle-in-cell kernels. In particular, we present empirical performance and productivity results comparing DARMA implementations of these kernels with more traditional implementations.