Invited Talks
Takayuki Aoki, Tokyo Tech
Beyond Peta-scale on Stencil and Particle-based GPU Applications
The GPU (Graphics Processing Unit) has been widely used in high-performance computing as an accelerator offering high computational performance and high memory bandwidth. We have succeeded in running several large-scale stencil and particle-based applications on the TSUBAME GPU supercomputer: a weather prediction model covering all of Japan at 500-m horizontal resolution, a turbulent airflow simulation using large-eddy simulation (LES) and the lattice Boltzmann method over a 10 km x 10 km area of central metropolitan Tokyo at 1-m resolution, and a phase-field simulation of the dendritic solidification of a binary alloy with 0.3 trillion cells. We also demonstrate granular DEM and fluid SPH simulations using billions of particles, which are quite different from N-body problems. Stencil framework approaches for practical applications have also been developed for high productivity, and an extended roofline model, which uses node performance and interconnect bandwidth in place of processor performance and memory bandwidth, is used to discuss post-petascale applications.
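To make the node-level roofline idea concrete, the following is a minimal sketch in Python. It applies the standard roofline bound with node-level quantities swapped in, as the abstract describes; the node peak and interconnect bandwidth figures are illustrative assumptions, not TSUBAME's actual specifications or the speaker's model parameters.

```python
# Minimal sketch of a node-level "extended roofline" bound.
# Classic roofline: attainable = min(peak, bandwidth * intensity).
# The node-level variant uses node peak performance and interconnect
# bandwidth, with intensity measured in flops per byte communicated.

def roofline(peak_flops, bandwidth, intensity):
    """Attainable performance (flop/s) at a given intensity (flop/byte)."""
    return min(peak_flops, bandwidth * intensity)

# Hypothetical single-node figures (assumed for illustration only).
NODE_PEAK = 4.0e12        # 4 Tflop/s aggregate node peak
INTERCONNECT_BW = 8.0e9   # 8 GB/s injection bandwidth

# A halo-exchanging stencil communicates fewer bytes per flop as the
# per-node subdomain grows, so the effective intensity varies widely.
for intensity in (10.0, 100.0, 1000.0):
    attainable = roofline(NODE_PEAK, INTERCONNECT_BW, intensity)
    print(f"{intensity:7.1f} flop/byte -> {attainable / 1e12:.2f} Tflop/s")
```

At low communication intensity the interconnect term dominates, and only at high intensity does the node peak become the binding limit, which is the point of recasting the roofline at node granularity.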
George Biros, UT Austin
Exascale N-body algorithms for data analysis and simulation
N-body algorithms are ubiquitous in science and engineering and form the core methods for many production codes in industrial, academic, and government labs. They find application in both computational physics and machine learning. Tree-based methods typically require irregular memory access patterns that result in reduced off-node and on-node performance. Although significant progress has been made in improving off-node performance, on-node performance remains an open problem. This is especially true for production tree-based codes that have multi-stage computations involving data reshuffles and multiple computational kernels. This on-node utilization wall, a chronic problem since the early nineties, not only remains unresolved but has become much more acute with the emergence of deeper memory hierarchies and manycore and heterogeneous architectures. In this talk, I will outline the computational kernels used in N-body methods and describe the challenges in scaling them efficiently.
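The kind of tree-based kernel the abstract refers to can be illustrated with a small sketch. The following is a minimal Barnes-Hut-style treecode in Python, an assumed stand-in rather than any production code from the talk; the recursive, pointer-chasing traversal is what produces the irregular memory access pattern discussed above.

```python
# Minimal Barnes-Hut-style treecode sketch (2D, unit-mass particles).
# The traversal in potential() chases child pointers whose layout in
# memory is data-dependent -- the source of irregular access patterns.

import math

class Node:
    def __init__(self, center, half_width, points):
        self.center = center            # (x, y) cell center
        self.half_width = half_width    # half the cell edge length
        self.mass = len(points)
        if points:                      # center of mass of the cell
            self.com = (sum(p[0] for p in points) / len(points),
                        sum(p[1] for p in points) / len(points))
        else:
            self.com = center
        self.children = []
        if len(points) > 1:             # recursively split crowded cells
            h = half_width / 2
            for dx in (-h, h):
                for dy in (-h, h):
                    c = (center[0] + dx, center[1] + dy)
                    sub = [p for p in points
                           if abs(p[0] - c[0]) <= h and abs(p[1] - c[1]) <= h]
                    if sub:
                        self.children.append(Node(c, h, sub))

def potential(node, x, theta=0.5, eps=1e-6):
    """Approximate potential at x: a cell is summarized by its center of
    mass when the opening criterion (cell size / distance < theta) holds."""
    dx, dy = node.com[0] - x[0], node.com[1] - x[1]
    r = math.sqrt(dx * dx + dy * dy) + eps
    if not node.children or (2 * node.half_width) / r < theta:
        return node.mass / r            # far field: one approximate term
    return sum(potential(c, x, theta, eps) for c in node.children)

points = [(0.1, 0.2), (0.8, 0.7), (0.3, 0.9), (0.6, 0.1)]
root = Node((0.5, 0.5), 0.5, points)
print(potential(root, (0.0, 0.0)))
```

Production codes replace this recursion with multi-stage pipelines (tree build, data reshuffles, interaction-list evaluation), which is exactly where the on-node utilization issues the talk addresses arise.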
Craig Stewart, Indiana Univ
Exascale - on what dimension?
US President Obama recently signed an executive order creating the National Strategic Computing Initiative. This executive order sets out a goal of creating an exaflops-class computing system for the US. One of the announcements about this initiative quoted the President's Council of Advisors on Science and Technology, which had previously stated that high-performance computing "must now assume a broader meaning, encompassing not only flops, but also the ability, for example, to efficiently manipulate vast and rapidly increasing quantities of both numerical and non-numerical data." This talk will discuss US efforts toward exascale computation, with particular attention to the synergies between the developments needed to enable tightly coupled exascale parallel workloads and those needed for exascale big-data computations.