Title :
An Extended Compression Format for the Optimization of Sparse Matrix-Vector Multiplication
Author :
Karakasis, Vasileios ; Gkountouvas, T. ; Kourtis, Kornilios ; Goumas, Georgios ; Koziris, Nectarios
Author_Institution :
Comput. Syst. Lab., Nat. Tech. Univ. of Athens (NTUA), Zografou, Greece
Abstract :
Sparse matrix-vector multiplication (SpM × V) has been characterized as one of the most significant computational kernels in scientific computing. The key algorithmic characteristic that prevents the SpM × V kernel from achieving high performance is its very low flop:byte ratio. In this paper, we present a compressed storage format, called Compressed Sparse eXtended (CSX), that is able to detect and simultaneously encode multiple commonly encountered substructures inside a sparse matrix. Relying on aggressive compression of the sparse matrix's indexing structure, CSX is able to considerably reduce the memory footprint of a sparse matrix, alleviating the pressure on the memory subsystem. On a diverse set of sparse matrices, CSX provided an average performance improvement of more than 40 percent over the standard CSR format on SMP architectures and more than 20 percent on NUMA systems, significantly outperforming other CSR alternatives. Additionally, it adapted successfully to the nonzero element structure of the considered matrices, exhibiting very stable performance. Finally, in the context of a “real-life” multiphysics simulation software, CSX accelerated the SpM × V component by nearly 40 percent and reduced the total solver time by approximately 15 percent.
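To make the flop:byte argument concrete, the following is a minimal sketch of the standard CSR baseline that the abstract compares against, not of CSX itself. The function names, the 8-byte values and 4-byte indices, and the choice to count only matrix data traffic (ignoring the input and output vectors) are illustrative assumptions, not details taken from the paper.

```python
def csr_spmv(values, col_ind, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_ind[k]]
        y[i] = acc
    return y


def csr_flop_byte_ratio(nnz, nrows, value_bytes=8, index_bytes=4):
    """Rough flop:byte estimate (assumption: only matrix data is counted)."""
    flops = 2 * nnz  # one multiply and one add per nonzero
    matrix_bytes = nnz * (value_bytes + index_bytes) + (nrows + 1) * index_bytes
    return flops / matrix_bytes


# Hypothetical 3x3 example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_ind = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_ind, row_ptr, x)
ratio = csr_flop_byte_ratio(nnz=len(values), nrows=3)
```

Under these assumptions each nonzero costs two floating-point operations but at least 12 bytes of matrix traffic, so the ratio stays below 2/12, roughly 0.17 flops per byte. Shrinking the index data, which is the part of the footprint a format such as CSX targets, raises that bound.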
Keywords :
data compression; indexing; information retrieval; matrix multiplication; optimisation; sparse matrices; storage management; CSX; NUMA systems; SMP architecture; aggressive compression technique; compressed sparse extended; extended storage compression format; memory subsystem; multiphysics simulation software; nonzero element structure; optimization; sparse matrix indexing structure; sparse matrix vector multiplication; standard CSR format; substructure detection; substructure encoding; Computer architecture; Encoding; Indexes; Kernel; Optimization; Sparse matrices; Vectors; Sparse Matrix-Vector Multiplication; data compression; multicore optimizations
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2012.290