Title :
Low-overhead load-balanced scheduling for sparse tensor computations
Author :
Baskaran, Muthu ; Meister, Benoit ; Lethin, Richard
Author_Institution :
Reservoir Labs, Inc., New York, NY, USA
Abstract :
Irregular computations over large-scale sparse data are prevalent in critical data applications, and they have significant room for improvement on modern computer systems in terms of parallelism and data locality. We introduce new techniques to efficiently map large irregular computations onto modern multi-core systems with non-uniform memory access (NUMA) behavior. Our techniques are broadly applicable to irregular computations with multi-dimensional sparse arrays (or sparse tensors). We implement a static-cum-dynamic task scheduling scheme with low overhead for effective parallelization of sparse computations. We introduce locality-aware optimizations to the task scheduling mechanism that are driven by the sparse input data pattern. We evaluate our techniques using two popular sparse tensor decomposition methods that have wide applications in data mining, graph analysis, signal processing, and elsewhere. Our techniques not only improve parallel performance but also improve performance scalability as the number of cores increases. We achieve around 4-5× improvement in performance over existing parallel approaches and observe “scalable” parallel performance on modern multi-core systems with up to 32 processor cores. We use real sparse data sets as input to the sparse tensor computations to demonstrate the achieved improvements.
Keywords :
multiprocessing systems; optimisation; parallel processing; processor scheduling; tensors; NUMA behavior; data locality; large-scale sparse data; locality-aware optimization; low-overhead load-balanced scheduling; multicore system; multidimensional sparse array; nonuniform memory access; parallelism; sparse tensor computation; sparse tensor decomposition; static-cum-dynamic task scheduling; task scheduling mechanism; Dynamic scheduling; Optimization; Processor scheduling; Runtime; Synchronization; Tensile stress;
Conference_Title :
High Performance Extreme Computing Conference (HPEC), 2014 IEEE
Conference_Location :
Waltham, MA
Print_ISBN :
978-1-4799-6232-7
DOI :
10.1109/HPEC.2014.7041006