An Optimized GP-GPU Warp Scheduling Algorithm for Sparse Matrix-Vector Multiplication

Author

Lifeng Liu ; Meilin Liu ; Chong-Jun Wang

Author_Institution

Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA

fYear

2013

fDate

17-19 July 2013

Firstpage

222

Lastpage

231

Abstract

GP-GPUs have been used as the platform for many applications because of their high computational throughput and massively parallel architecture. In this paper, we first investigate the CSR sparse matrix format and the performance of existing optimized SpMV (sparse matrix-vector multiplication) algorithms, and analyze the memory access patterns of these algorithms. Based on this analysis, we propose a new thread scheduling technique that exploits inter-warp locality and intra-warp locality simultaneously and achieves memory coalescing automatically. The proposed scheduling technique significantly changes the memory access pattern of SpMV. Simulation results show that SpMV using the proposed thread scheduling technique performs much better than SpMV implementations optimized by other techniques.
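
For readers unfamiliar with the CSR format mentioned in the abstract, the following is a minimal sequential sketch of CSR-based SpMV. It is not the scheduling technique proposed in the paper; it only illustrates the memory access pattern the abstract refers to. The function name `csr_spmv`, the array names `row_ptr`, `col_idx`, and `values`, and the 3x3 example matrix are assumptions chosen for illustration.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    Illustrative names (not taken from the paper):
      row_ptr[i] .. row_ptr[i + 1] delimit the nonzeros of row i,
      col_idx[k] is the column of the k-th nonzero,
      values[k] is its value.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # values and col_idx are read sequentially, but x is gathered
            # through col_idx, so accesses to x are irregular.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y = csr_spmv(row_ptr, col_idx, values, x)
```

In a GPU version, rows would typically be distributed across threads or warps, and the gather on `x` is where locality and coalescing become the concern.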

Keywords

graphics processing units; mathematics computing; matrix multiplication; multiprocessing systems; scheduling; vectors; CSR sparse matrix format; general purpose graphics processing unit; inter-warp locality; intra-warp locality; memory access patterns; memory coalescence; optimized GP-GPU warp scheduling algorithm; optimized SpMV algorithms; sparse matrix-vector multiplication; thread scheduling technique; Algorithm design and analysis; Arrays; Graphics processing units; Instruction sets; Sparse matrices; Vectors; GPU; SpMV; data locality; manycore; multicore; scheduling;

fLanguage

English

Publisher

IEEE

Conference_Titel

Networking, Architecture and Storage (NAS), 2013 IEEE Eighth International Conference on

Conference_Location

Xi'an

Type

conf

DOI

10.1109/NAS.2013.35

Filename

6665367