Title :
Parallelizing the Hamiltonian Computation in DQMC Simulations: Checkerboard Method for Sparse Matrix Exponentials on Multicore and GPU
Author :
Lee, Che-Rung ; Chen, Zhi-Hung ; Kao, Quey-Liang
Author_Institution :
Dept. Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
Abstract :
Determinant Quantum Monte Carlo (DQMC) simulation is one of the few numerical methods that can explore the microscopic properties of fermions, which have many technically important applications in chemistry and materials science. Conventionally, its parallelization relies on the parallel Monte Carlo method, whose speedup is limited by the thermalization process and the underlying matrix computation. To achieve better performance, fine-grained parallelization of its numerical kernels is essential for utilizing massively parallel processing units, namely multicore CPUs and/or GPUs interconnected by a high-performance network. In this paper, we address the parallelization of one of the matrix kernels in DQMC simulations: the multiplication of matrix exponentials. The matrix is derived from the kinetic Hamiltonian and is highly sparse. We approximate its exponential by the checkerboard method, which decomposes the matrix exponential into a product of a sequence of block sparse matrices. We analyze the block sparse matrices of two commonly used lattice geometries, the 2D torus and the 3D cubic lattice, and parallelize the computational kernel that multiplies them with a general matrix. The parallel algorithm is designed for multicore CPUs and GPUs. Experimental results show an average speedup of 3 times on a quad-core processor and a speedup of up to 145 times on a GPU.
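The checkerboard idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the lattice is assumed to be a 2D torus with even side lengths greater than 2 and unit hopping, the parameter a stands for the hopping amplitude times the time step, and the four bond groups are multiplied in an arbitrarily chosen first-order ordering. Each group contains only disjoint bonds, so its exponential is a block sparse matrix of 2-by-2 cosh/sinh blocks that can be applied to pairs of rows independently, which is the property a multicore or GPU kernel would exploit.

```python
import numpy as np

def torus_bond_groups(lx, ly):
    """Split the nearest-neighbour bonds of an lx-by-ly torus into 4 groups
    (x-even, x-odd, y-even, y-odd) whose bonds share no sites."""
    assert lx % 2 == 0 and ly % 2 == 0 and lx > 2 and ly > 2
    site = lambda x, y: (x % lx) + lx * (y % ly)
    groups = []
    for dx, dy in ((1, 0), (0, 1)):
        for parity in (0, 1):
            bonds = [(site(x, y), site(x + dx, y + dy))
                     for y in range(ly) for x in range(lx)
                     if (x * dx + y * dy) % 2 == parity]
            groups.append(np.array(bonds))
    return groups

def apply_checkerboard(groups, a, m):
    """Return (E_1 E_2 E_3 E_4) @ m, where E_g = exp(a * A_g) and A_g is the
    adjacency matrix of bond group g. No N-by-N factor is ever formed."""
    c, s = np.cosh(a), np.sinh(a)
    out = np.array(m, dtype=float)
    for bonds in reversed(groups):          # rightmost factor acts first
        i, j = bonds[:, 0], bonds[:, 1]
        ri, rj = out[i], out[j]             # fancy indexing returns copies
        out[i] = c * ri + s * rj            # each bond mixes only its two rows
        out[j] = s * ri + c * rj
    return out

def exact_exponential(groups, a, n):
    """Dense reference exp(a * A) via a symmetric eigendecomposition."""
    adj = np.zeros((n, n))
    for bonds in groups:
        adj[bonds[:, 0], bonds[:, 1]] = 1.0
        adj[bonds[:, 1], bonds[:, 0]] = 1.0
    w, v = np.linalg.eigh(adj)
    return (v * np.exp(a * w)) @ v.T

if __name__ == "__main__":
    lx = ly = 4
    n = lx * ly
    groups = torus_bond_groups(lx, ly)
    for a in (0.05, 0.025):
        approx = apply_checkerboard(groups, a, np.eye(n))
        err = np.abs(approx - exact_exponential(groups, a, n)).max()
        print(f"a = {a}: max abs error = {err:.2e}")
```

In this sketch the row updates within one group are independent of each other, so they could be distributed across threads; the vectorized NumPy indexing only stands in for such a kernel. The approximation error shrinks as a is reduced, as expected for a first-order splitting.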
Keywords :
Monte Carlo methods; determinants; digital simulation; fermion systems; fermions; graphics processing units; materials science computing; matrix decomposition; matrix multiplication; multiprocessing systems; parallel algorithms; quantum theory; sparse matrices; 2D torus; 3D cubic; DQMC simulation; GPU; Hamiltonian computation parallelization; block sparse matrix; checkerboard method; chemistry; computational kernel; determinant quantum Monte Carlo simulation; fermion microproperties; fine-grained parallelization; high performance network; kinetic Hamiltonian; lattice geometry; massive parallel processing unit; material science; matrix computation; matrix exponential decomposition; matrix exponential multiplication; matrix kernel; multicore CPU; numerical kernel; numerical method; parallel algorithm; quad core processor; sparse matrix exponentials; thermalization process; Graphics processing unit; Kinetic theory; Lattices; Matrix decomposition; Multicore processing; Sparse matrices; Symmetric matrices; GPU; Matrix exponential; Multicore; Quantum Monte Carlo Simulation; Sparse matrices;
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
DOI :
10.1109/IPDPSW.2012.233