Regional Information Center for Science and Technology - Parallelizing the Hamiltonian Computation in DQMC Simulations: Checkerboard Method for Sparse Matrix Exponentials on Multicore and GPU

DocumentCode :

2991429

Title :

Parallelizing the Hamiltonian Computation in DQMC Simulations: Checkerboard Method for Sparse Matrix Exponentials on Multicore and GPU

Author :

Lee, Che-Rung ; Chen, Zhi-Hung ; Kao, Quey-Liang

Author_Institution :

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan

fYear :

2012

fDate :

21-25 May 2012

Firstpage :

1889

Lastpage :

1897

Abstract :

Determinant Quantum Monte Carlo (DQMC) simulation is one of the few numerical methods that can explore the micro properties of fermions, and it has many technically important applications in chemistry and materials science. Conventionally, its parallelization relies on the parallel Monte Carlo method, whose speedup is limited by the thermalization process and the underlying matrix computation. To achieve better performance, fine-grained parallelization of its numerical kernels is essential to utilize massively parallel processing units, namely multicore processors and/or GPUs interconnected by a high-performance network. In this paper, we address the parallelization of one of the matrix kernels in DQMC simulations: the multiplication of matrix exponentials. The matrix is derived from the kinetic Hamiltonian and is highly sparse. We approximate its exponential by the checkerboard method, which decomposes the matrix exponential into a product of block sparse matrices. We analyze the block sparse matrices of two commonly used lattice geometries, the 2D torus and the 3D cubic lattice, and parallelize the computational kernel that multiplies them by a general matrix. The parallel algorithm is designed for multicore CPUs and GPUs. Experimental results show that a speedup of 3 times is observed on average on a quad-core processor, and that a speedup of 145 times is achievable on a GPU.
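
The checkerboard decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which this record does not include. It assumes a uniform hopping amplitude `t`, a 2D torus with even side lengths of at least 4, the sign convention K = -t × (adjacency matrix), and a split of the bonds into four groups of disjoint bonds; the function names are hypothetical. Within one group the bonds share no sites, so the exponential of that group is exactly a set of independent 2×2 blocks with entries cosh(t·Δτ) and sinh(t·Δτ), and the per-bond row updates are independent of each other, which is what makes the kernel parallelizable.

```python
import math

def torus_bond_groups(lx, ly):
    """Split nearest-neighbour bonds of an lx-by-ly torus into 4 groups of disjoint bonds."""
    assert lx % 2 == 0 and ly % 2 == 0 and lx >= 4 and ly >= 4
    site = lambda x, y: (x % lx) + lx * (y % ly)
    groups = []
    for parity in (0, 1):  # x-direction bonds starting at even, then odd, x
        groups.append([(site(x, y), site(x + 1, y))
                       for y in range(ly) for x in range(parity, lx, 2)])
    for parity in (0, 1):  # y-direction bonds starting at even, then odd, y
        groups.append([(site(x, y), site(x, y + 1))
                       for x in range(lx) for y in range(parity, ly, 2)])
    return groups

def apply_checkerboard(a, groups, t, dtau):
    """Multiply the product of group exponentials (approximating exp(-dtau*K)) by matrix a."""
    c, s = math.cosh(t * dtau), math.sinh(t * dtau)
    out = [list(row) for row in a]
    for bonds in reversed(groups):  # the rightmost factor acts on a first
        for i, j in bonds:          # disjoint bonds: iterations are independent
            ri, rj = out[i], out[j]
            out[i] = [c * x + s * y for x, y in zip(ri, rj)]
            out[j] = [s * x + c * y for x, y in zip(ri, rj)]
    return out

def dense_exponential(groups, n, t, dtau, terms=30):
    """Reference only: Taylor series of exp(-dtau*K) for the dense matrix K."""
    m = [[0.0] * n for _ in range(n)]
    for bonds in groups:
        for i, j in bonds:
            m[i][j] = m[j][i] = t * dtau  # entries of -dtau*K with K = -t*adjacency
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][l] * m[l][j] for l in range(n)) / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Usage: compare the checkerboard product with the dense exponential on a 4x4 torus.
n = 16
identity = [[float(i == j) for j in range(n)] for i in range(n)]
groups = torus_bond_groups(4, 4)
approx = apply_checkerboard(identity, groups, t=1.0, dtau=0.05)
exact = dense_exponential(groups, n, t=1.0, dtau=0.05)
max_err = max(abs(approx[i][j] - exact[i][j]) for i in range(n) for j in range(n))
```

In this sketch each group costs a constant amount of work per matrix entry of `a`, whereas a dense exponential would require a full matrix product. The price is a splitting error between `approx` and `exact` that shrinks as `dtau` is reduced, because the four groups do not commute with one another.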

Keywords :

Monte Carlo methods; determinants; digital simulation; fermion systems; fermions; graphics processing units; materials science computing; matrix decomposition; matrix multiplication; multiprocessing systems; parallel algorithms; quantum theory; sparse matrices; 2D torus; 3D cubic; DQMC simulation; GPU; Hamiltonian computation parallelization; block sparse matrix; checkerboard method; chemistry; computational kernel; determinant quantum Monte Carlo simulation; fermion microproperties; fine-grained parallelization; high performance network; kinetic Hamiltonian; lattice geometry; massive parallel processing unit; material science; matrix computation; matrix exponential decomposition; matrix exponential multiplication; matrix kernel; multicore CPU; numerical kernel; numerical method; parallel algorithm; quad core processor; sparse matrix exponentials; thermalization process; Graphics processing unit; Kinetic theory; Lattices; Matrix decomposition; Multicore processing; Sparse matrices; Symmetric matrices; GPU; Matrix exponential; Multicore; Quantum Monte Carlo Simulation; Sparse matrices;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW)

Conference_Location :

Shanghai

Print_ISBN :

978-1-4673-0974-5

Type :

conf

DOI :

10.1109/IPDPSW.2012.233

Filename :

6270392

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2991429