DocumentCode :
3001597
Title :
Communication-Optimal Parallel N-body Solvers
Author :
Chandramowlishwaran, Aparna ; Vuduc, Richard
Author_Institution :
Sch. of Comput. Sci. & Eng., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
2462
Lastpage :
2465
Abstract :
We present new analysis, algorithmic techniques, and implementations of the Fast Multipole Method (FMM) for solving N-body problems. Our research specifically addresses two key challenges. The first challenge is how to engineer fast code for today's platforms. We present the first in-depth study of multicore optimizations and tuning for FMM, along with a systematic approach for transforming a conventionally parallelized FMM into a highly tuned one. We introduce novel optimizations that significantly improve the within-node scalability of the FMM, thereby enabling high performance in the face of multicore and manycore systems. The second challenge is how to understand scalability on future systems. We present a new algorithmic complexity analysis of the FMM that considers both intra- and inter-node communication costs. This analysis yields the surprising prediction that although the FMM is largely compute-bound today, and therefore highly scalable on current systems, the trajectory of processor architecture designs, if there are no significant changes, could cause it to become communication-bound as early as the year 2020. This prediction suggests the utility of our analysis approach, which directly relates algorithmic and architectural characteristics, for enabling a new kind of high-level algorithm-architecture co-design.
Keywords :
computational complexity; multiprocessing systems; parallel architectures; algorithmic characteristics; algorithmic complexity analysis; algorithmic technique; architectural characteristics; communication-bound; communication-optimal parallel N-body solver; fast code engineering; fast multipole method; high-level algorithm-architecture co-design; inter-node communication cost; intra-node communication cost; many core system; multicore optimization; multicore system; parallelized FMM; processor architecture design; within-node scalability; Algorithm design and analysis; Computational modeling; Multicore processing; Optimization; Predictive models; Scalability; Tuning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
Type :
conf
DOI :
10.1109/IPDPSW.2012.303
Filename :
6270869