DocumentCode :
3024991
Title :
An implementation of the QR iterations for finding eigenvalues of matrices with CUDA on GPU
Author :
Meirui Ren ; Weiping Zhang ; Tao Wang ; Ning Tian ; Jinbao Li ; Longjiang Guo
Author_Institution :
Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
fYear :
2013
fDate :
20-22 Dec. 2013
Firstpage :
1572
Lastpage :
1577
Abstract :
Based on the principle of QR iterations, this paper implements a parallel algorithm on GPU (Graphics Processing Unit) hardware, using routines from CUDA (Compute Unified Device Architecture), to find the eigenvalues of general matrices. The CPU and the GPU computing card form a client-server computing framework: the CPU acts as the client and the GPU card as the computing server. The experimental environment uses an NVIDIA GeForce GTX460 GPU card as the server side and an Intel Core i5-760 quad-core CPU as the client side, running Windows 7 64-bit as the operating system. The parallel implementation consists of two parts: PA_H and PA_QR. PA_H is a procedure that transforms a matrix A into a Hessenberg matrix B. PA_QR is the actual parallel algorithm of the QR iterations, applied to the Hessenberg matrix B to find the eigenvalues of the general matrix A. The speedup ratio of the proposed algorithm remains stable as the number of iterations grows. The experimental results show that the parallel implementation with CUDA on GPU takes less running time than traditional sequential algorithms. The speedup ratio of PA_H is between 1.79 and 7.81, and that of PA_QR is between 3.24 and 118.9. In particular, when the order of the general matrix is 8192 and the number of iterations is 10000, the speedup ratios of PA_H and PA_QR reach 7.81 and 118.9, respectively.
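The two-stage structure described in the abstract (reduction to Hessenberg form, then QR iterations on the result) can be illustrated with a minimal sequential sketch. This is not the paper's CUDA code: the function names `hessenberg` and `qr_iterations` are hypothetical, the reduction is assumed to use Householder reflections, and the iteration is the plain unshifted variant, which is only expected to converge to the eigenvalues on the diagonal when they are real and of distinct magnitude.

```python
import math

def hessenberg(A):
    """Reduce a square matrix to upper Hessenberg form with Householder
    similarity transforms (assumed method; the record does not specify one)."""
    n = len(A)
    H = [row[:] for row in A]
    for k in range(n - 2):
        # Part of column k that lies below the diagonal.
        x = [H[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column is already in Hessenberg shape
        # Householder vector v chosen to avoid cancellation.
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])
        norm_v = math.sqrt(sum(t * t for t in v))
        v = [t / norm_v for t in v]
        m = len(v)
        # Apply P = I - 2 v v^T from the left to rows k+1..n-1.
        for j in range(n):
            dot = sum(v[i] * H[k + 1 + i][j] for i in range(m))
            for i in range(m):
                H[k + 1 + i][j] -= 2.0 * v[i] * dot
        # Apply P from the right to columns k+1..n-1 (keeps similarity).
        for i in range(n):
            dot = sum(H[i][k + 1 + j] * v[j] for j in range(m))
            for j in range(m):
                H[i][k + 1 + j] -= 2.0 * dot * v[j]
    return H

def qr_iterations(H, iterations):
    """Run unshifted QR iterations T <- R Q on a Hessenberg matrix,
    factorising with Givens rotations. Returns the final iterate."""
    n = len(H)
    T = [row[:] for row in H]
    for _ in range(iterations):
        rotations = []
        # T = Q R: zero each subdiagonal entry with a Givens rotation.
        for k in range(n - 1):
            a, b = T[k][k], T[k + 1][k]
            r = math.hypot(a, b)
            c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
            rotations.append((c, s))
            for j in range(n):
                t1, t2 = T[k][j], T[k + 1][j]
                T[k][j] = c * t1 + s * t2
                T[k + 1][j] = -s * t1 + c * t2
        # T <- R Q: apply the transposed rotations from the right.
        for k, (c, s) in enumerate(rotations):
            for i in range(n):
                t1, t2 = T[i][k], T[i][k + 1]
                T[i][k] = c * t1 + s * t2
                T[i][k + 1] = -s * t1 + c * t2
    return T

if __name__ == "__main__":
    A = [[2.0, 1.0, 0.0],
         [1.0, 3.0, 1.0],
         [0.0, 1.0, 4.0]]
    T = qr_iterations(hessenberg(A), 200)
    print(sorted(T[i][i] for i in range(len(T))))
```

For the small symmetric matrix in the usage example, the exact eigenvalues are 3 and 3 ± √3, so the printed diagonal should approach roughly 1.268, 3 and 4.732. A GPU version such as the one the abstract describes would parallelise the inner row and column updates; how the paper maps them to CUDA kernels is not stated in this record.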
Keywords :
client-server systems; eigenvalues and eigenfunctions; graphics processing units; iterative methods; mathematics computing; matrix algebra; operating systems (computers); parallel algorithms; parallel architectures; CPU computing card; CUDA; GPU computing card; GPU hardware; Hessenberg matrix; Intel Core i5-760 quad-core; NVIDIA GeForce GTX460; PA_H; PA_QR; QR loop iterations; Win7 64-bit operating system; client-server computing framework; computer unified device architecture; graphic process unit; matrix eigenvalues; parallel algorithm; Eigenvalues and eigenfunctions; Graphics processing units; Hardware; Instruction sets; Kernel; CUDA; Eigenvalue; General matrix; QR iteration;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Mechatronic Sciences, Electric Engineering and Computer (MEC), Proceedings 2013 International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4799-2564-3
Type :
conf
DOI :
10.1109/MEC.2013.6885312
Filename :
6885312