DocumentCode
580120
Title
Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems
Author
Song, Fengguang ; YarKhan, Asim ; Dongarra, Jack
Author_Institution
EECS Dept., Univ. of Tennessee, Knoxville, TN, USA
fYear
2009
fDate
14-20 Nov. 2009
Firstpage
1
Lastpage
11
Abstract
This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a task-based library to replace the existing linear algebra subroutines such as PBLAS to transparently provide the same interface and computational function as the ScaLAPACK library. Linear algebra programs are written with the task-based library and executed by a dynamic runtime system. We mainly focus our runtime system design on the metric of performance scalability. We propose a distributed algorithm to resolve data dependences without process cooperation. We have implemented the runtime system and applied it to three linear algebra algorithms: Cholesky, LU, and QR factorizations. Our experiments on both shared-memory machines (16, 32 cores) and distributed-memory machines (1024 cores) demonstrate that our runtime system is able to achieve good scalability. Furthermore, we provide an analytical analysis to show why the tiled algorithms are scalable and what execution time to expect.
Keywords
distributed algorithms; distributed memory systems; linear algebra; mathematics computing; matrix decomposition; processor scheduling; shared memory systems; software libraries; Cholesky factorization; LU factorization; PBLAS; QR factorization; ScaLAPACK library; analytical analysis; computational function; data dependency; dense linear algebra algorithms; distributed algorithm; distributed-memory machines; distributed-memory multicore systems; distributed-memory system; dynamic runtime system; dynamic task scheduling; linear algebra programs; linear algebra subroutines; performance scalability; process cooperation; runtime system design; shared-memory machines; shared-memory system; task-based library; tiled algorithms;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing Networking, Storage and Analysis, Proceedings of the Conference on
Conference_Location
Portland, OR
Type
conf
DOI
10.1145/1654059.1654079
Filename
6375569