DocumentCode :
1998169
Title :
Scheduling a Parallel Sparse Direct Solver to Multiple GPUs
Author :
Kim, Kyungjoo ; Eijkhout, Victor
Author_Institution :
Dept. of Aerosp. Eng. & Eng. Mech., Univ. of Texas at Austin, Austin, TX, USA
fYear :
2013
fDate :
20-24 May 2013
Firstpage :
1401
Lastpage :
1408
Abstract :
We present a sparse direct solver that uses multi-level task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which exploits dense subproblems (fronts) related through an assembly tree. Critical to the solver's performance is dynamic task allocation, which accounts for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of multifrontal factorization using multi-level matrix partitioning. Large blocks provide coarse-grained tasks for fast devices, and some of the blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained from problems arising from a high-order Finite Element Method.
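The coarse/fine task idea in the abstract can be illustrated with a minimal sketch. This is a toy 1-D model written for this record, not the authors' solver: the names `partition` and `schedule`, the block sizes, and the round-robin polling that stands in for "next available device" are all assumptions made for illustration.

```python
from collections import deque


def partition(lo, hi, block):
    """Split the half-open index range [lo, hi) into chunks of at most `block`."""
    return [(s, min(s + block, hi)) for s in range(lo, hi, block)]


def schedule(n, coarse, fine, devices):
    """Assign block tasks over [0, n) to devices (hypothetical toy scheduler).

    `devices` is a list of (name, is_fast) pairs, polled round-robin as a
    stand-in for device availability. A fast device takes a whole coarse
    block; otherwise a coarse block is split on demand into fine pieces
    and one fine piece is handed out.
    """
    coarse_q = deque(partition(0, n, coarse))
    fine_q = deque()
    assignment = []
    turn = 0
    while coarse_q or fine_q:
        name, is_fast = devices[turn % len(devices)]
        turn += 1
        if is_fast and coarse_q:
            # Coarse-grained task for a fast device.
            assignment.append((name, coarse_q.popleft()))
        else:
            if not fine_q:
                # Second partitioning level: refine one coarse block.
                lo, hi = coarse_q.pop()
                fine_q.extend(partition(lo, hi, fine))
            assignment.append((name, fine_q.popleft()))
    return assignment


result = schedule(8, coarse=4, fine=2,
                  devices=[("gpu0", True), ("cpu0", False)])
```

In this toy run the fast device receives one coarse block of four indices, while the remaining coarse block is split into two fine pieces that are handed out as devices come up in turn. A real multifrontal solver would partition two-dimensional dense fronts and respect the dependencies of the assembly tree, which this sketch does not model.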
Keywords :
finite element analysis; graphics processing units; matrix decomposition; multiprocessing systems; parallel processing; processor scheduling; GPU accelerators; assembly tree; asymmetric heterogeneous device performance; device-specific tasks; dynamic task allocation; fine-grained tasks; finite element method; multicore host processor; multifrontal factorization; multifrontal method; multilevel matrix partitioning; multilevel task scheduling; parallel sparse direct solver scheduling; Graphics processing units; Libraries; Multicore processing; Partitioning algorithms; Performance evaluation; Processor scheduling; Sparse matrices; Algorithms-by-blocks; Bulk-synchronous model; Heterogeneous architectures; MultiGPU; Multicore; Multifrontal factorization; hp-FEM;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
Type :
conf
DOI :
10.1109/IPDPSW.2013.26
Filename :
6651033