DocumentCode :
3145659
Title :
Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA
Author :
Bosilca, George ; Bouteiller, Aurelien ; Danalis, Anthony ; Faverge, Mathieu ; Haidar, Azzam ; Herault, Thomas ; Kurzak, Jakub ; Langou, Julien ; Lemarinier, Pierre ; Ltaief, Hatem ; Luszczek, Piotr ; YarKhan, Asim ; Dongarra, Jack
Author_Institution :
Innovative Comput. Lab., Univ. of Tennessee, Knoxville, TN, USA
fYear :
2011
fDate :
16-20 May 2011
Firstpage :
1432
Lastpage :
1441
Abstract :
We present a method for developing dense linear algebra algorithms that seamlessly scales to thousands of cores. This is achieved with our project, DPLASMA (Distributed PLASMA), which uses a novel generic distributed Direct Acyclic Graph Engine (DAGuE). The engine has been designed for high performance computing and thus enables scaling of tile algorithms, originating in PLASMA, on large distributed-memory systems. The underlying DAGuE framework has many appealing features for distributed-memory platforms with heterogeneous multicore nodes: a DAG representation that is independent of the problem size, automatic extraction of the communication from the dependencies, overlapping of communication and computation, task prioritization, and architecture-aware scheduling and management of tasks. The originality of this engine lies in its capacity to translate a sequential code with nested loops into a concise and synthetic format that can then be interpreted and executed in a distributed environment. We present three common dense linear algebra algorithms from PLASMA (Parallel Linear Algebra for Scalable Multi-core Architectures), namely the Cholesky, LU, and QR factorizations, to investigate their data-driven expression and execution in a distributed system. We demonstrate through experimental results on the Cray XT5 Kraken system that our DAG-based approach has the potential to achieve a sizable fraction of peak performance, which is characteristic of state-of-the-art distributed numerical software on current and emerging architectures.
Keywords :
directed graphs; matrix decomposition; multiprocessing systems; parallel architectures; Cholesky factorization; Cray XT5 Kraken system; DAGuE framework; DPLASMA; LU factorization; QR factorization; architecture-aware scheduling; dense linear algebra algorithms; direct acyclic graph engine; distributed PLASMA; distributed memory systems; high performance computing; massively parallel architectures; multicore nodes; nested loops; parallel linear algebra; scalable multicore architectures; sequential code translation; task management; task prioritization; Engines; Heuristic algorithms; Linear algebra; Multicore processing; Plasmas; Tiles;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-61284-425-1
Electronic_ISBN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2011.299
Filename :
6008998