A Fast Implementation of Matrix-matrix Product in Double-double Precision on NVIDIA C2050 and Application to Semidefinite Programming

Author

Nakata, Mitsuru ; Takao, Y. ; Noda, Satoshi ; Himeno, Ryutaro

Author_Institution

Adv. Center for Comput. & Commun. 2F, RIKEN, Wako, Japan

fYear

2012

fDate

5-7 Dec. 2012

Firstpage

68

Lastpage

75

Abstract

We have implemented a fast double-double precision (approximately 32 significant decimal digits) version of the matrix-matrix multiplication routine "Rgemm" of MPACK (http://mplapack.sourceforge.net/) on the NVIDIA C2050 GPU. This routine is a higher-precision version of "dgemm" in the BLAS (Basic Linear Algebra Subprograms) library. Our implementation is the fastest to date on the NVIDIA C2050 and the most efficient on NVIDIA GPUs. We achieved peak performances of 16.4 GFlops for the kernel (16.1 GFlops with CPU-GPU transfer included), and 26.4 GFlops (25.7 GFlops with CPU-GPU transfer included) by employing lower-accuracy arithmetic. These are 92.3% (90.7%) and 87.1% (84.8%) of the theoretical peak performance of the NVIDIA C2050, and about 150 times faster than the reference implementation on an Intel Xeon X3470. Moreover, our implementation can handle matrices of arbitrary size by employing the "pointer redirecting" technique of Nath et al. We integrated this GPU-accelerated version of Rgemm into SDPA-DD, the double-double precision version of the semidefinite programming solver SDPA, and its performance improved by up to 14.5 times. This version of Rgemm has been available at http://mplapack.sourceforge.net/ since 2011/10/28.
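
For readers unfamiliar with double-double arithmetic, the following is a minimal CPU-side sketch in Python, not the authors' GPU kernel. It represents each number as an unevaluated sum of two doubles (hi, lo) and builds a naive matrix product on top of error-free transformations. All function names here (two_sum, two_prod, dd_add, dd_mul, dd_gemm) are hypothetical and chosen for illustration. The addition shown is the simpler variant that does not fully propagate the low-order error terms, which is an assumption of this sketch rather than a description of the paper's "lower accuracy arithmetic".

```python
def quick_two_sum(a, b):
    """Renormalize a + b into (hi, lo); assumes |a| >= |b| or a == 0."""
    s = a + b
    return s, b - (s - a)

def two_sum(a, b):
    """Error-free addition (Knuth): s + e equals a + b exactly."""
    s = a + b
    bb = s - a
    return s, (a - (s - bb)) + (b - bb)

def split(a):
    """Dekker split of a double into two 26-bit halves (no overflow assumed)."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Error-free product without FMA: p + e equals a * b exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    return p, ((ah * bh - p) + ah * bl + al * bh) + al * bl

def dd_add(x, y):
    """Double-double addition, simple variant (assumption of this sketch)."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return quick_two_sum(s, e)

def dd_mul(x, y):
    """Double-double multiplication; the lo*lo term is dropped."""
    p, e = two_prod(x[0], y[0])
    e += x[0] * y[1] + x[1] * y[0]
    return quick_two_sum(p, e)

def dd_gemm(A, B):
    """Naive C = A * B over matrices of (hi, lo) pairs (lists of lists)."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[(0.0, 0.0) for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = (0.0, 0.0)
            for l in range(k):
                acc = dd_add(acc, dd_mul(A[i][l], B[l][j]))
            C[i][j] = acc
    return C
```

As a usage example, dd_add((1.0, 0.0), (1e-20, 0.0)) keeps the small term in the low word, whereas the plain double sum 1.0 + 1e-20 rounds to 1.0. Each double-double multiply-add costs many more floating-point operations than a plain one, which is why GFlops figures for this kind of routine are far below the hardware's double-precision peak.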

Keywords

graphics processing units; mathematical programming; matrix multiplication; BLAS library; GPU-accelerated version; Intel Xeon X3470; MPACK; NVIDIA C2050 GPU; Rgemm; SDPA-DD; arbitrary matrices sizes; basic linear algebra subprograms library; dgemm; double-double precision version; kernel performance; matrix-matrix multiplication routine; matrix-matrix product implementation; pointer redirecting technique; semidefinite programming; Electronic mail; Graphics processing units; Instruction sets; Kernel; Libraries; Linear algebra; Programming; BLAS; GPU; MPACK; double-double precision; multiple precision;

fLanguage

English

Publisher

IEEE

Conference_Titel

Networking and Computing (ICNC), 2012 Third International Conference on

Conference_Location

Okinawa

Print_ISBN

978-1-4673-4624-5

Type

conf

DOI

10.1109/ICNC.2012.19

Filename

6424545