Thomas Sterling

Chief Scientist and Associate Director, CREST

Professor, School of Informatics and Computing
Indiana University
Bloomington, IN

Runtime System Architecture for Dynamic Adaptive Execution

Even the largest contemporary conventional HPC system architectures are optimized for the basic operations and access patterns of classical matrix and vector processing. These include an emphasis on FPU utilization, high data reuse requiring temporal and spatial locality, and uniform-stride indexing through regular data structures, whether dense or sparse. Systems in the 100 Petaflops performance regime, such as the Chinese Sunway TaihuLight and the US CORAL systems Summit and Aurora, to be deployed in 2018, remain constrained by these properties in spite of their innovations. Emerging classes of application problems in data analytics, machine learning, and knowledge management demand very different operational properties because of their highly irregular, sparse, and dynamic behaviors, which exhibit little or no data reuse, random access patterns, and meta-data dominated processing. Close examination suggests that at the core of these “big data” applications is dynamic adaptive graph processing, which is in some ways diametrically opposite to conventional matrix computing. Of immediate importance is the need to significantly improve efficiency and scalability as well as user productivity and performance portability, while reducing energy consumption. Key to this is the introduction of powerful runtime system software that exploits real-time system status information to support dynamic adaptive resource management and task scheduling. But software alone will be insufficient at extreme scale, where near fine-grained parallelism is necessary and software overheads will bound efficiency and scalability. A new era of architecture research is beginning in the combined domains of accelerator hardware for both graph processing and runtime systems.
This presentation will discuss the nature of the computational challenges, examples of and experiments with the state-of-the-art HPX-5 runtime system software, and future directions in hardware architecture support for exascale runtime-assisted big data computation. Questions and comments from the audience will be welcome throughout the talk.