DocumentCode :
3137632
Title :
A Fast Implementation of Matrix-matrix Product in Double-double Precision on NVIDIA C2050 and Application to Semidefinite Programming
Author :
Nakata, Mitsuru ; Takao, Y. ; Noda, Satoshi ; Himeno, Ryutaro
Author_Institution :
Adv. Center for Comput. & Commun. 2F, RIKEN, Wako, Japan
fYear :
2012
fDate :
5-7 Dec. 2012
Firstpage :
68
Lastpage :
75
Abstract :
We have implemented a fast double-double precision (approximately 32 significant decimal digits) version of the matrix-matrix multiplication routine "Rgemm" of MPACK (http://mplapack.sourceforge.net/) on the NVIDIA C2050 GPU. This routine is a higher-precision version of "dgemm" in the BLAS (Basic Linear Algebra Subprograms) library. Our implementation is the fastest to date on the NVIDIA C2050 and the most efficient on NVIDIA GPUs: we achieved peak performance of 16.4 GFlops for the kernel (16.1 GFlops with CPU-GPU transfer included), and 26.4 GFlops (25.7 GFlops with CPU-GPU transfer included) by employing lower-accuracy arithmetic. These figures are 92.3% (90.7%) and 87.1% (84.8%) of the theoretical peak performance of the NVIDIA C2050, and about 150 times faster than the reference implementation on an Intel Xeon X3470. Moreover, our implementation can handle matrices of arbitrary size by employing the "pointer redirecting" technique of Nath et al. We integrated this GPU-accelerated version of Rgemm into the double-double precision semidefinite programming solver SDPA-DD, and performance improved by up to 14.5 times. This version of Rgemm has been available at http://mplapack.sourceforge.net/ since 2011/10/28.
Keywords :
graphics processing units; mathematical programming; matrix multiplication; BLAS library; GPU-accelerated version; Intel Xeon X3470; MPACK; NVIDIA C2050 GPU; Rgemm; SDPA-DD; arbitrary matrices sizes; basic linear algebra subprograms library; dgemm; double-double precision version; kernel performance; matrix-matrix multiplication routine; matrix-matrix product implementation; pointer redirecting technique; semidefinite programming; Electronic mail; Graphics processing units; Instruction sets; Kernel; Libraries; Linear algebra; Programming; BLAS; GPU; MPACK; double-double precision; multiple precision;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Networking and Computing (ICNC), 2012 Third International Conference on
Conference_Location :
Okinawa
Print_ISBN :
978-1-4673-4624-5
Type :
conf
DOI :
10.1109/ICNC.2012.19
Filename :
6424545