DocumentCode :
592848
Title :
Accelerating block checkerboard method on GPU for performance enhancement of 2D and 3D Quantum Monte Carlo simulations
Author :
Chi-Cheng Chuang ; Yu-Sheng Chiu ; Quey-Liang Kao ; Zhi-Hung Chen ; Che-Rung Lee
Author_Institution :
Inst. for Inf. Ind., Smart Network Syst. Inst., Taipei, Taiwan
fYear :
2012
fDate :
3-6 Dec. 2012
Firstpage :
717
Lastpage :
722
Abstract :
Quantum Monte Carlo (QMC) simulations in recent studies of complex materials face new computational challenges. The traditional approach of accelerating the simulations by running parallel Monte Carlo chains faces serious scalability problems, since the speedup is approaching the limit predicted by Amdahl's law. Fine-grained parallelization of matrix kernels is essential to achieve better performance. In this paper, we investigate performance optimization techniques on GPU for the most time-consuming computational kernel in the Determinant Quantum Monte Carlo (DQMC) simulation: multiplication of matrix exponentials. The matrix, derived from the kinetic Hamiltonian, is highly sparse, and its exponential is approximated by the block checkerboard method, which represents a matrix exponential as a product of a sequence of sparse matrices. We focus on the matrix exponentials arising from 2D and 3D toruses and propose various optimization techniques, such as data streaming and concurrent kernels. Experiments show that the proposed optimization techniques can improve the SpMM (Sparse Matrix Multiplication) function, modified from the CUDA SDK SpMV function, by up to 16 times and 117 times for 2D and 3D problems, respectively.
Keywords :
Monte Carlo methods; approximation theory; determinants; graphics processing units; matrix multiplication; optimisation; parallel architectures; parallel machines; quantum computing; sparse matrices; 2D quantum Monte Carlo simulation; 2D toruses; 3D quantum Monte Carlo simulation; 3D toruses; Amdahl law; CUDA SDK SpMV function; GPU; SpMM; block checkerboard acceleration method; complex material; determinant quantum Monte Carlo; fine grained parallelization; kinetic Hamiltonian; matrix exponential approximation; matrix exponential multiplication; matrix kernel; parallel Monte Carlo method; performance optimization; scalability problem; sparse matrix multiplication; sparse matrix sequence; Computational modeling; Graphics processing units; Kernel; Kinetic theory; Monte Carlo methods; Optimization; Sparse matrices; GPU; Matrix exponential; Quantum Monte Carlo Simulation; Sparse matrices;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4673-4511-8
Electronic_ISBN :
978-1-4673-4509-5
Type :
conf
DOI :
10.1109/CloudCom.2012.6427564
Filename :
6427564