• DocumentCode
    2996809
  • Title

    A Fast Parallel Implementation of Molecular Dynamics with the Morse Potential on a Heterogeneous Petascale Supercomputer

  • Author

    Wu, Qiang ; Yang, Canqun ; Wang, Feng ; Xue, Jingling

  • Author_Institution
    School of Computer Science, National University of Defense Technology, Changsha, China
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    140
  • Lastpage
    149
  • Abstract
    Molecular Dynamics (MD) simulations have been widely used in the study of macromolecules. To ensure an acceptable level of statistical accuracy, a relatively large number of particles is needed, which calls for high-performance implementations of MD. These days, heterogeneous systems, with their high performance potential, low power consumption, and high price-performance ratio, offer a viable alternative for running MD simulations. In this paper, we introduce a fast parallel implementation of MD simulation with the Morse potential on Tianhe-1A, a petascale heterogeneous supercomputer. Our code achieves a speedup of 3.6× on one NVIDIA Tesla M2050 GPU (containing 14 Streaming Multiprocessors) compared to a 2.93 GHz six-core Intel Xeon X5670 CPU. In addition, our code runs faster on 1024 compute nodes (each with two CPUs and one GPU) than on 4096 GPU-excluded nodes, effectively rendering one GPU more efficient than six six-core CPUs. Our work shows that large-scale MD simulations can benefit enormously from GPU acceleration in petascale supercomputing platforms. Our performance results are achieved by using (1) a patch-cell design to exploit parallelism across the simulation domain, (2) a new GPU kernel that takes advantage of Newton's Third Law to reduce redundant force computation on GPUs, and (3) two optimization methods: a dynamic load balancing strategy that adjusts the workload, and a communication overlapping method to overlap the communications between CPUs and GPUs.
  • Keywords
    Morse potential; chemistry computing; graphics processing units; macromolecules; molecular dynamics method; parallel machines; resource allocation; statistical analysis; GPU acceleration; GPU kernel; GPU-excluded node; MD simulation; NVIDIA Tesla M2050 GPU; Newton Third Law; Tianhe-1A; communication overlapping method; dynamic load balancing strategy; fast parallel implementation; heterogeneous petascale supercomputer; heterogeneous systems; molecular dynamics simulation; optimization method; parallelism; patch-cell design; petascale supercomputing platform; redundant force computation; six-core Intel Xeon X5670 CPU; statistical accuracy; streaming multiprocessors; workload adjustment; Computational modeling; Force; Graphics processing unit; Indexes; Instruction sets; Kernel; Mathematical model; GPU computing; Molecular Dynamics; heterogeneous computing; petascale supercomputer
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4673-0974-5
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2012.13
  • Filename
    6270634