• DocumentCode
    3001128
  • Title
    Sparse Matrix-vector Multiplication on GPGPU Clusters: A New Storage Format and a Scalable Implementation
  • Author
    Kreutzer, Moritz ; Hager, Georg ; Wellein, Gerhard ; Fehske, Holger ; Basermann, Achim ; Bishop, Alan R.
  • Author_Institution
    Erlangen Regional Computing Center, Erlangen, Germany
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    1696
  • Lastpage
    1702
  • Abstract
    Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia "Fermi" class of GPGPUs. A new "padded jagged diagonals storage" (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme while making no assumptions about the matrix structure. In our test scenarios the pJDS format cuts the overall spMVM memory footprint on the GPGPU by up to 70%, and achieves 91% to 130% of the ELLPACK-R performance. Using a suitable performance model we identify performance bottlenecks on the node level that invalidate some types of matrix structures for efficient multi-GPGPU parallelization. For appropriate sparsity patterns we extend previous work on distributed-memory parallel spMVM to demonstrate a scalable hybrid MPI-GPGPU code, achieving efficient overlap of communication and computation.
  • Keywords
    graphics processing units; sparse matrices; support vector machines; ELLPACK-R scheme; GPGPU clusters; matrix structure; nVidia Fermi class; new storage format; padded jagged diagonals storage; scalable implementation; spMVM; sparse matrix vector multiplication; sparse solvers; sparsity patterns; Bandwidth; Computational modeling; Error correction codes; Instruction sets; Kernel; Vectors; CUDA; GPGPU
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW)
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4673-0974-5
  • Type
    conf
  • DOI
    10.1109/IPDPSW.2012.211
  • Filename
    6270844
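
The abstract's claim that pJDS reduces the memory overhead of ELLPACK-style storage can be illustrated with a small slot count. The sketch below is not taken from the paper: it assumes, as the format's name suggests, that rows are sorted by descending number of nonzeros and then padded per block of rows rather than to the globally longest row. The function names, the `block_rows` parameter, and the example row lengths are hypothetical, and the numbers it produces are not the paper's measurements.

```python
def ellpack_slots(row_lengths):
    """Stored entries when every row is padded to the longest row."""
    if not row_lengths:
        return 0
    return len(row_lengths) * max(row_lengths)


def sorted_block_padded_slots(row_lengths, block_rows=32):
    """Stored entries when rows are sorted by descending length and each
    block of `block_rows` rows is padded only to its own longest row.

    Assumption: this mimics the idea behind a padded jagged-diagonals
    layout; the last block is not filled up to a full block here.
    """
    ordered = sorted(row_lengths, reverse=True)
    total = 0
    for start in range(0, len(ordered), block_rows):
        block = ordered[start:start + block_rows]
        total += len(block) * block[0]  # block[0] is the longest row in the block
    return total


if __name__ == "__main__":
    # Hypothetical matrix: 4 long rows and 60 short rows.
    lengths = [40] * 4 + [5] * 60
    print("nonzeros:            ", sum(lengths))
    print("ELLPACK-style slots: ", ellpack_slots(lengths))
    print("block-padded slots:  ", sorted_block_padded_slots(lengths, block_rows=8))
```

In this toy case a few long rows force every row to the full width under global padding, whereas block-wise padding after sorting confines the waste to the block that contains the long rows.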