A memory transaction model for Sparse Matrix-Vector multiplications on GPUs

Author

Keklikian, Thalie ; Langlois, J.M. Pierre ; Savaria, Yvon

Author_Institution

Groupe de Recherche en Microélectronique et Microsystèmes, Polytechnique Montréal, Montréal, QC, Canada

fYear

2014

fDate

22-25 June 2014

Firstpage

309

Lastpage

312

Abstract

Sparse matrix-vector multiplication (SpMV) is an algorithm used in many fields. Since the introduction of CUDA and general-purpose programming on GPUs, several efforts to optimize it have been reported. SpMV optimization is complex because its irregular memory accesses depend on the distribution of nonzero elements in the matrix. In this paper, we propose a model that predicts the number of memory transactions of SpMV for a matrix stored in the compressed sparse row (CSR) format. With the number of memory transactions known in advance, the performance and the execution time can be estimated. The model can be used to select the best-suited CUDA implementation for the sparse matrices of a given application domain. Predicted results from the model are within 7.5% for the matrices of more than 1000 rows that we have tested on the NVIDIA Tesla K20c and GeForce GTX 670.
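
To make the idea of counting memory transactions concrete, here is a minimal Python sketch. It is not the model proposed in the paper, whose details the abstract does not give. It assumes a scalar CSR kernel (one thread per row), warps of 32 threads that advance in lockstep, and reads of the input vector served in aligned 128-byte segments; the function names and the parameters `warp_size`, `elem_bytes` and `seg_bytes` are hypothetical and chosen only for illustration.

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """Reference y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += vals[k] * x[col_idx[k]]
    return y


def count_x_transactions(row_ptr, col_idx, warp_size=32,
                         elem_bytes=4, seg_bytes=128):
    """Count the aligned segments of x touched by each warp.

    Assumption (not from the paper): one thread per row, and at each
    step every active thread of a warp reads one element of x. Reads
    that fall in the same aligned segment are merged into one transaction.
    """
    n_rows = len(row_ptr) - 1
    per_seg = seg_bytes // elem_bytes  # elements of x per segment
    total = 0
    for first in range(0, n_rows, warp_size):
        rows = range(first, min(first + warp_size, n_rows))
        longest = max(row_ptr[r + 1] - row_ptr[r] for r in rows)
        for step in range(longest):
            segments = set()
            for r in rows:
                k = row_ptr[r] + step
                if k < row_ptr[r + 1]:  # thread still has nonzeros left
                    segments.add(col_idx[k] // per_seg)
            total += len(segments)
    return total


# 4x4 example: [[1,0,2,0],[0,3,0,0],[0,0,0,0],[4,0,0,5]]
row_ptr = [0, 2, 3, 3, 5]
col_idx = [0, 2, 1, 0, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0, 4.0]
y = csr_spmv(row_ptr, col_idx, vals, x)
n_tx = count_x_transactions(row_ptr, col_idx)
```

The count covers only the gathers from the input vector, which are the accesses that depend on the nonzero distribution; reads of the CSR arrays and writes of the result would need to be counted separately.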

Keywords

graphics processing units; matrix multiplication; storage management; vectors; CSR format; CUDA; GPU; NVIDIA GeForce GTX 670; NVIDIA Tesla K20c; SpMV algorithm; compute unified device architecture; general purpose programming; graphics processing unit; memory transaction model; nonzero element distribution; sparse matrix-vector multiplications; Instruction sets; Kernel; Optimization; Predictive models; Sparse matrices; SpMV

fLanguage

English

Publisher

IEEE

Conference_Titel

New Circuits and Systems Conference (NEWCAS), 2014 IEEE 12th International

Conference_Location

Trois-Rivières, QC

Type

conf

DOI

10.1109/NEWCAS.2014.6934044

Filename

6934044