• DocumentCode
    592848
  • Title

    Accelerating block checkerboard method on GPU for performance enhancement of 2D and 3D Quantum Monte Carlo simulations

  • Author

    Chi-Cheng Chuang ; Yu-Sheng Chiu ; Quey-Liang Kao ; Zhi-Hung Chen ; Che-Rung Lee

  • Author_Institution
    Inst. for Inf. Ind., Smart Network Syst. Inst., Taipei, Taiwan
  • fYear
    2012
  • fDate
    3-6 Dec. 2012
  • Firstpage
    717
  • Lastpage
    722
  • Abstract
    Quantum Monte Carlo (QMC) simulations in recent studies of complex materials face new computational challenges. The traditional approach of accelerating the simulations with parallel Monte Carlo chains faces serious scalability problems, since the speedup is approaching the limit predicted by Amdahl's law. Fine-grained parallelization of matrix kernels is essential to achieve better performance. In this paper, we investigate performance optimization techniques on GPU for the most time-consuming computational kernel in the Determinant Quantum Monte Carlo (DQMC) simulation: multiplication of matrix exponentials. The matrix, derived from the kinetic Hamiltonian, is highly sparse, and its exponential is approximated by the block checkerboard method, which represents a matrix exponential as a product of a sequence of sparse matrices. We focus on the matrix exponentials from 2D and 3D toruses and propose various optimization techniques, such as data streaming and concurrent kernels. Experiments show that the proposed optimization techniques can speed up the SpMM (Sparse Matrix Multiplication) function, modified from the CUDA SDK SpMV function, by up to 16 times and 117 times for 2D and 3D problems, respectively.
  • Keywords
    Monte Carlo methods; approximation theory; determinants; graphics processing units; matrix multiplication; optimisation; parallel architectures; parallel machines; quantum computing; sparse matrices; 2D quantum Monte Carlo simulation; 2D toruses; 3D quantum Monte Carlo simulation; 3D toruses; Amdahl law; CUDA SDK SpMV function; GPU; SpMM; block checkerboard acceleration method; complex material; determinant quantum Monte Carlo; fine grained parallelization; kinetic Hamiltonian; matrix exponential approximation; matrix exponential multiplication; matrix kernel; parallel Monte Carlo method; performance optimization; scalability problem; sparse matrix multiplication; sparse matrix sequence; Computational modeling; Graphics processing units; Kernel; Kinetic theory; Monte Carlo methods; Optimization; Sparse matrices; GPU; Matrix exponential; Quantum Monte Carlo Simulation; Sparse matrices;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on
  • Conference_Location
    Taipei
  • Print_ISBN
    978-1-4673-4511-8
  • Electronic_ISBN
    978-1-4673-4509-5
  • Type

    conf

  • DOI
    10.1109/CloudCom.2012.6427564
  • Filename
    6427564
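
  • Illustration
    The abstract describes approximating a sparse matrix exponential by a product of sparse factors. The sketch below shows the plain (non-block) checkerboard split on a 1D ring, which is a simpler setting than the 2D and 3D toruses and the block variant studied in the paper. It is not the authors' GPU implementation. All function names (ring_hopping, bond_factor, checkerboard_exp, exact_exp) and the parameters n, t, and dt are hypothetical, chosen for illustration only. The idea is to split the hopping matrix into two sets of disjoint bonds, exponentiate each set exactly with 2x2 cosh/sinh blocks, and multiply the two sparse factors.

```python
import numpy as np

def ring_hopping(n, t=1.0):
    """Hopping matrix K of a 1D ring: K[i, i+1] = K[i+1, i] = -t (periodic)."""
    K = np.zeros((n, n))
    for i in range(n):
        j = (i + 1) % n
        K[i, j] = K[j, i] = -t
    return K

def bond_factor(n, bonds, a):
    """Exact exp(a * A), where A has ones on the given disjoint bonds (i, j).

    Each bond contributes an independent 2x2 block
    [[cosh a, sinh a], [sinh a, cosh a]], so the factor stays sparse.
    """
    B = np.eye(n)
    c, s = np.cosh(a), np.sinh(a)
    for i, j in bonds:
        B[i, i] = B[j, j] = c
        B[i, j] = B[j, i] = s
    return B

def checkerboard_exp(n, dt, t=1.0):
    """Approximate exp(-dt * K) as a product of two sparse bond factors.

    Assumes n is even and n >= 4, so that each bond set is disjoint.
    """
    a = dt * t  # K = -t * (A_even + A_odd), hence exp(-dt*K) = exp(a*(A_even + A_odd))
    even = [(i, (i + 1) % n) for i in range(0, n, 2)]
    odd = [(i, (i + 1) % n) for i in range(1, n, 2)]
    return bond_factor(n, even, a) @ bond_factor(n, odd, a)

def exact_exp(K, dt):
    """Reference exp(-dt * K) for symmetric K via eigendecomposition."""
    w, V = np.linalg.eigh(K)
    return (V * np.exp(-dt * w)) @ V.T

def max_error(n, dt):
    """Largest entrywise difference between the approximation and the reference."""
    return np.max(np.abs(checkerboard_exp(n, dt) - exact_exp(ring_hopping(n), dt)))
```

    Calling max_error(8, 0.05) and max_error(8, 0.025) gives a quick check that the splitting error shrinks roughly with the square of dt, as expected for a first-order product formula. Each factor has only two nonzeros per row, which is the kind of sparsity that makes a sparse-times-dense multiplication kernel attractive.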