• DocumentCode
    3740656
  • Title
    Memory-Efficient Parallelization of 3D Lattice Boltzmann Flow Solver on a GPU
  • Author
    Nhat-Phuong Tran;Myungho Lee;Dong Hoon Choi
  • Author_Institution
    Dept. of Comput. Sci. &
  • fYear
    2015
  • Firstpage
    315
  • Lastpage
    324
  • Abstract
    The Lattice Boltzmann Method (LBM) is a powerful numerical method for simulating fluid flow. With its data-parallel nature and simple kernel structure, it is a promising candidate for parallel implementation on a GPU. The LBM, however, is heavily data-intensive and memory-bound. In particular, moving data to adjacent cells in the streaming phase of the LBM incurs many uncoalesced memory accesses on the GPU, which degrades overall performance. In this paper, we parallelize the LBM on a GPU by incorporating memory-efficient techniques: a tiling optimization combined with data layout changes, and a data update scheme known as the pull scheme. Furthermore, we develop optimization techniques such as removing branch divergence, reducing register usage, and reducing the number of double-precision floating-point instructions. Experimental results on an Nvidia Tesla K20 GPU show that our approach delivers up to 1105 MLUPS (Million Lattice Updates Per Second) and a 156-fold speedup compared with a serial implementation.
  • Keywords
    "Graphics processing units","Instruction sets","Registers","Optimization","Computer architecture","Lattice Boltzmann methods"
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing (HiPC), 2015 IEEE 22nd International Conference on
  • Type
    conf
  • DOI
    10.1109/HiPC.2015.49
  • Filename
    7397646
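
The streaming step and the MLUPS metric named in the abstract can be illustrated with a minimal sketch. This is not the paper's 3D GPU kernel: it assumes a hypothetical 1D lattice with three velocities (D1Q3), periodic boundaries, and one array per velocity direction, and the function names `stream_push`, `stream_pull`, and `mlups` are chosen here for illustration only. In a push scheme each cell reads its own values and writes them to its neighbours; in a pull scheme each cell reads from its neighbours and writes locally. Both produce the same streamed state.

```python
# Illustrative 1D lattice (D1Q3) with periodic boundaries.
# Assumption: f[i][x] holds the distribution for velocity i at cell x.
VELOCITIES = (0, 1, -1)  # rest, right-moving, left-moving


def stream_push(f):
    """Push scheme: local read, write to the neighbouring cell."""
    n = len(f[0])
    out = [[0.0] * n for _ in VELOCITIES]
    for i, c in enumerate(VELOCITIES):
        for x in range(n):
            out[i][(x + c) % n] = f[i][x]
    return out


def stream_pull(f):
    """Pull scheme: read from the neighbouring cell, local write."""
    n = len(f[0])
    out = [[0.0] * n for _ in VELOCITIES]
    for i, c in enumerate(VELOCITIES):
        for x in range(n):
            out[i][x] = f[i][(x - c) % n]
    return out


def mlups(nx, ny, nz, steps, seconds):
    """Million Lattice Updates Per Second for an nx*ny*nz grid."""
    return nx * ny * nz * steps / seconds / 1e6


if __name__ == "__main__":
    f = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0]]
    print(stream_pull(f) == stream_push(f))
    print(mlups(100, 100, 100, 1000, 1.0))
```

The two functions differ only in which side of the copy is indexed by the neighbour, which is the distinction the pull scheme relies on; how that choice interacts with memory coalescing on a particular GPU is beyond what this sketch shows.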