• DocumentCode
    167445
  • Title
    CoAdELL: Adaptivity and Compression for Improving Sparse Matrix-Vector Multiplication on GPUs

  • Author
    Maggioni, Matteo; Berger-Wolf, Tanya

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Chicago, Chicago, IL, USA
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    933
  • Lastpage
    940
  • Abstract
    Numerous applications in science and engineering rely on sparse linear algebra. The efficiency of a fundamental kernel such as the Sparse Matrix-Vector multiplication (SpMV) is crucial for solving increasingly complex computational problems. However, the SpMV is notorious for its extremely low arithmetic intensity and irregular memory patterns, posing a challenge for optimization. Over the last few years, an extensive amount of literature has been devoted to implementing SpMV on Graphics Processing Units (GPUs), with the aim of exploiting the available fine-grain parallelism and memory bandwidth. In this paper, we propose to efficiently combine adaptivity and compression into an ELL-based sparse format in order to improve the state of the art of the SpMV on GPUs. The foundation of our work is AdELL, an efficient sparse data structure based on the idea of distributing working threads to rows according to their computational load, creating balanced hardware-level blocks (warps) while coping with the irregular matrix structure. We designed a lightweight index compression scheme based on delta encoding and warp granularity that can be transparently embedded into AdELL, leading to an immediate performance benefit associated with the bandwidth-limited nature of the SpMV. The proposed integration yields a highly optimized novel sparse matrix format known as Compressed Adaptive ELL (CoAdELL). We evaluated the effectiveness of our approach on a large set of benchmarks from heterogeneous application domains. The results show consistent improvements for double-precision SpMV calculations over the AdELL baseline. Moreover, we assessed the general relevance of CoAdELL with respect to other optimized GPU-based sparse matrix formats.
    We drew a direct comparison with clSpMV and BRO-HYB, obtaining sufficient experimental evidence (33% geometric average improvement over clSpMV and 43% over BRO-HYB) to propose our research work as the new state of the art.
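    The delta-encoding idea the abstract describes can be sketched as follows: instead of storing each 4-byte column index of a row in full, store the first index and then only the gaps to subsequent indices, which are usually small and fit in fewer bytes. This is a minimal illustrative sketch, not the paper's exact layout; the function names and byte widths are assumptions, and CoAdELL additionally chooses the delta width at warp granularity (one width per warp block) rather than per element.

    ```python
    # Hypothetical sketch of delta encoding for sparse column indices,
    # the compression idea behind CoAdELL (names/widths are illustrative).

    def delta_encode(col_indices):
        """Encode sorted column indices of one row as a base plus gaps."""
        if not col_indices:
            return None, []
        base = col_indices[0]
        deltas = [b - a for a, b in zip(col_indices, col_indices[1:])]
        return base, deltas

    def bytes_needed(value):
        """Smallest of 1, 2, or 4 bytes able to hold an unsigned value."""
        if value < 1 << 8:
            return 1
        if value < 1 << 16:
            return 2
        return 4

    # Example row from a wide matrix: full indices need 4 bytes each,
    # while most gaps between neighbouring nonzeros fit in 1-2 bytes.
    cols = [5, 12, 13, 900, 905]
    base, deltas = delta_encode(cols)          # base=5, deltas=[7, 1, 887, 5]
    raw_bytes = 4 * len(cols)                  # 20 bytes uncompressed
    compressed_bytes = 4 + sum(bytes_needed(d) for d in deltas)  # 9 bytes
    ```

    Because SpMV is bandwidth-limited, reading fewer index bytes per nonzero translates directly into higher throughput, which is why the abstract reports an immediate performance benefit from the compression.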
  • Keywords
    data compression; graphics processing units; matrix multiplication; sparse matrices; CoAdELL; ELL-based sparse format; GPU; balanced hardware-level blocks; complex computational problems; compressed adaptive ELL; double-precision SpMV calculations; efficient sparse data structure; graphic processing units; lightweight index compression scheme; sparse matrix-vector multiplication; Data structures; Graphics processing units; Indexes; Instruction sets; Kernel; Sparse matrices; ELL; GPU; SpMV; adaptive; compression; linear algebra; matrix format; optimization; sparse;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
  • Conference_Location
    Phoenix, AZ
  • Print_ISBN
    978-1-4799-4117-9
  • Type
    conf

  • DOI
    10.1109/IPDPSW.2014.106
  • Filename
    6969482