DocumentCode
592848
Title
Accelerating block checkerboard method on GPU for performance enhancement of 2D and 3D Quantum Monte Carlo simulations
Author
Chi-Cheng Chuang ; Yu-Sheng Chiu ; Quey-Liang Kao ; Zhi-Hung Chen ; Che-Rung Lee
Author_Institution
Inst. for Inf. Ind., Smart Network Syst. Inst., Taipei, Taiwan
fYear
2012
fDate
3-6 Dec. 2012
Firstpage
717
Lastpage
722
Abstract
Quantum Monte Carlo (QMC) simulations used in recent studies of complex materials face new computational challenges. The traditional approach of accelerating the simulations by running parallel Monte Carlo chains faces serious scalability problems, since the speedup is approaching the limit predicted by Amdahl's law. Fine-grained parallelization of matrix kernels is essential to achieve better performance. In this paper, we investigate performance optimization techniques on GPU for the most time-consuming computational kernel in the Determinant Quantum Monte Carlo (DQMC) simulation: multiplication of matrix exponentials. The matrix, derived from the kinetic Hamiltonian, is highly sparse, and its exponential is approximated by the block checkerboard method, which represents a matrix exponential as a product of a sequence of sparse matrices. We focus on the matrix exponentials arising from 2D and 3D toruses and propose various optimization techniques, such as data streaming and concurrent kernels. Experiments show that the proposed optimization techniques can improve the performance of the SpMM (Sparse Matrix Multiplication) function, modified from the CUDA SDK SpMV function, by up to 16 times and 117 times for 2D and 3D problems, respectively.
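The idea of writing a matrix exponential as a product of sparse factors can be illustrated with a minimal sketch. This is not the paper's GPU code or its block variant: it assumes the plain bond-wise checkerboard splitting on a 1D ring with an even number of sites, a kinetic matrix K = -t * A (A being the ring's adjacency matrix), and hypothetical function names chosen for illustration. Each group of disjoint bonds exponentiates exactly into 2x2 cosh/sinh blocks, so applying a factor only mixes pairs of rows.

```python
import math


def ring_bond_groups(n):
    """Split the bonds of an n-site ring into two groups of disjoint bonds."""
    if n < 4 or n % 2 != 0:
        raise ValueError("this sketch assumes an even ring with at least 4 sites")
    even = [(i, i + 1) for i in range(0, n, 2)]
    odd = [(i, (i + 1) % n) for i in range(1, n, 2)]
    return even, odd


def apply_bond_group(bonds, c, s, m):
    """Left-multiply m in place by the exponential of one bond group.

    For disjoint bonds the factor is the identity except for a 2x2 block
    [[c, s], [s, c]] on each bond (i, j), so only rows i and j are mixed.
    """
    for i, j in bonds:
        row_i, row_j = m[i], m[j]
        m[i] = [c * a + s * b for a, b in zip(row_i, row_j)]
        m[j] = [s * a + c * b for a, b in zip(row_i, row_j)]


def checkerboard_expm_times(n, t, dt, m):
    """Approximate exp(-dt * K) @ m for K = -t * A on an n-site ring.

    Uses exp(x * A) ~= exp(x * A_even) @ exp(x * A_odd) with x = t * dt,
    which carries an error of order dt**2 because the groups do not commute.
    """
    x = t * dt
    c, s = math.cosh(x), math.sinh(x)
    out = [row[:] for row in m]          # leave the caller's matrix untouched
    even, odd = ring_bond_groups(n)
    apply_bond_group(odd, c, s, out)     # rightmost factor acts first
    apply_bond_group(even, c, s, out)
    return out


if __name__ == "__main__":
    n = 6
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    approx = checkerboard_expm_times(n, t=1.0, dt=0.1, m=identity)
    for row in approx:
        print(" ".join(f"{v:8.5f}" for v in row))
```

Applied to the identity, the routine returns the approximate exponential itself. On a 2D or 3D torus the same pattern would need more bond groups (for example, two per lattice direction), which this sketch does not cover.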
Keywords
Monte Carlo methods; approximation theory; determinants; graphics processing units; matrix multiplication; optimisation; parallel architectures; parallel machines; quantum computing; sparse matrices; 2D quantum Monte Carlo simulation; 2D toruses; 3D quantum Monte Carlo simulation; 3D toruses; Amdahl law; CUDA SDK SpMV function; GPU; SpMM; block checkerboard acceleration method; complex material; determinant quantum Monte Carlo; fine grained parallelization; kinetic Hamiltonian; matrix exponential approximation; matrix exponential multiplication; matrix kernel; parallel Monte Carlo method; performance optimization; scalability problem; sparse matrix multiplication; sparse matrix sequence; Computational modeling; Graphics processing units; Kernel; Kinetic theory; Monte Carlo methods; Optimization; Sparse matrices; GPU; Matrix exponential; Quantum Monte Carlo Simulation; Sparse matrices;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on
Conference_Location
Taipei
Print_ISBN
978-1-4673-4511-8
Electronic_ISBN
978-1-4673-4509-5
Type
conf
DOI
10.1109/CloudCom.2012.6427564
Filename
6427564