Title :
Locality Aware DAG-Scheduling for LU-Decomposition
Author :
Maier, Tobias ; Sanders, Peter ; Speck, Jochen
Author_Institution :
Inst. for Theor. Inf., Karlsruhe Inst. of Technol., Karlsruhe, Germany
Abstract :
Modern computers have deepening memory hierarchies with multiple levels of (partially shared) caches and non-uniform memory access (NUMA). This makes it increasingly difficult and important to schedule computations in such a way that expensive memory accesses are avoided. In this paper we choose LU-decomposition as a case study, since its use in the famous LINPACK benchmark means that highly tuned codes are already available. Our approach is to perform the very same computations as a leading implementation (PLASMA) but to schedule them in a more locality-aware way. In particular, we take better account of independent subtasks that share the same input data, and we explicitly address NUMA effects by coordinating memory layout and task scheduling. These measures lead to a performance improvement of up to 36% compared to PLASMA.
Keywords :
cache storage; processor scheduling; storage management; LINPACK benchmark; LU-decomposition; NUMA-effects coordinating memory layout; PLASMA; caches; locality aware DAG-scheduling; memory hierarchies; modern computers; nonuniform memory access; task scheduling; Informatics; Layout; Matrix decomposition; Plasmas; Processor scheduling; Program processors; Schedules; Cache memories; Numerical Linear Algebra; Scheduling and task partitioning
Conference_Title :
Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
Conference_Location :
Hyderabad
DOI :
10.1109/IPDPS.2015.85