DocumentCode :
1877568
Title :
Accelerating Strassen-Winograd's matrix multiplication algorithm on GPUs
Author :
Pai-Wei Lai ; Arafat, Humayun ; Elango, Venmugil ; Sadayappan, P.
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2013
fDate :
18-21 Dec. 2013
Firstpage :
139
Lastpage :
148
Abstract :
In this paper, we report on the development of an efficient GPU implementation of the Strassen-Winograd matrix multiplication algorithm for matrices of arbitrary sizes. We utilize multi-kernel streaming to exploit concurrency across sub-matrix operations in addition to intra-operation parallelism. We evaluate the performance of the implementation in comparison with CUBLAS-5.0 on Fermi and Kepler GPUs. The experimental results demonstrate the usefulness of Strassen's algorithm for practically relevant matrix sizes on GPUs, with up to 1.27X speedup for single-precision and 1.42X speedup for double-precision floating point computation.
Keywords :
floating point arithmetic; graphics processing units; matrix multiplication; performance evaluation; CUBLAS-5.0; Fermi GPU; GPU implementation; Kepler GPU; Strassen's algorithm; Strassen-Winograd matrix multiplication algorithm; double-precision floating point computation; intraoperation parallelism; multikernel streaming; submatrix operations; Algorithm design and analysis; Computational modeling; Graphics processing units; Instruction sets; Kernel; Memory management; Standards
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing (HiPC), 2013 20th International Conference on
Conference_Location :
Bangalore
Type :
conf
DOI :
10.1109/HiPC.2013.6799109
Filename :
6799109