Regional Information Center for Science and Technology (RICeST) - Communication-Optimal Parallel N-body Solvers

DocumentCode :

3001597

Title :

Communication-Optimal Parallel N-body Solvers

Author :

Chandramowlishwaran, Aparna ; Vuduc, Richard

Author_Institution :

Sch. of Comput. Sci. & Eng., Georgia Inst. of Technol., Atlanta, GA, USA

fYear :

2012

fDate :

21-25 May 2012

Firstpage :

2462

Lastpage :

2465

Abstract :

We present new analysis, algorithmic techniques, and implementations of the Fast Multipole Method (FMM) for solving N-body problems. Our research specifically addresses two key challenges. The first challenge is how to engineer fast code for today's platforms. We present the first in-depth study of multicore optimizations and tuning for FMM, along with a systematic approach for transforming a conventionally parallelized FMM into a highly tuned one. We introduce novel optimizations that significantly improve the within-node scalability of the FMM, thereby enabling high performance in the face of multicore and many-core systems. The second challenge is how to understand scalability on future systems. We present a new algorithmic complexity analysis of the FMM that considers both intra- and inter-node communication costs. This analysis yields the surprising prediction that although the FMM is largely compute-bound today, and therefore highly scalable on current systems, the trajectory of processor architecture designs, if there are no significant changes, could cause it to become communication-bound as early as the year 2020. This prediction suggests the utility of our analysis approach, which directly relates algorithmic and architectural characteristics, for enabling a new kind of high-level algorithm-architecture co-design.

Keywords :

computational complexity; multiprocessing systems; parallel architectures; algorithmic characteristics; algorithmic complexity analysis; algorithmic technique; architectural characteristics; communication-bound; communication-optimal parallel N-body solver; fast code engineering; fast multipole method; high-level algorithm-architecture co-design; inter-node communication cost; intra-node communication cost; many core system; multicore optimization; multicore system; parallelized FMM; processor architecture design; within-node scalability; Algorithm design and analysis; Computational modeling; Multicore processing; Optimization; Predictive models; Scalability; Tuning;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International

Conference_Location :

Shanghai

Print_ISBN :

978-1-4673-0974-5

Type :

conf

DOI :

10.1109/IPDPSW.2012.303

Filename :

6270869

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3001597