• DocumentCode
    1827174
  • Title

    An Effective Approach for Implementing Sparse Matrix-Vector Multiplication on Graphics Processing Units

  • Author

    Abu-Sufah, Walid; Karim, A.A.

  • Author_Institution
    Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2012
  • fDate
    25-27 June 2012
  • Firstpage
    453
  • Lastpage
    460
  • Abstract
    Sparse matrix-vector multiplication (SpMV) is often a performance bottleneck in iterative solvers. Recently, Graphics Processing Units (GPUs) have been deployed to enhance the performance of this operation. We present BTJAD, a blocked version of the Transposed Jagged Diagonal storage format that is tailored for GPUs. We develop a highly optimized SpMV kernel that takes advantage of the properties of the BTJAD storage format and reuses loaded values of the source vector in the registers of a GPU. Using 62 matrices with different sparsity patterns and executing on an NVIDIA Tesla T10 GPU, we compare the performance of our kernel with that of the SpMV kernels in NVIDIA's library. Our kernel achieves superior execution throughput for matrices whose nonzero row lengths are non-uniform, outperforming the best available kernels by up to 4.67x. When executing on the Fermi-class GeForce GTX480 GPU, which has a larger register file, the maximum speedup achieved by our kernel improves to 6.6x.
  • Keywords
    graphics processing units; iterative methods; matrix multiplication; optimisation; parallel architectures; sparse matrices; BTJAD storage format; Fermi class GeForce GTX480 GPU; NVIDIA Tesla T10; SpMV kernel; graphics processing unit; iterative solver; optimization; sparse matrix vector multiplication; sparsity pattern; transposed jagged diagonal storage format; Graphics processing unit; Instruction sets; Kernel; Optimization; Registers; Sparse matrices; Vectors; CUDA; GPU; SpMV; sparse linear algebra;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS)
  • Conference_Location
    Liverpool
  • Print_ISBN
    978-1-4673-2164-8
  • Type

    conf

  • DOI
    10.1109/HPCC.2012.68
  • Filename
    6332207