DocumentCode :
3740656
Title :
Memory-Efficient Parallelization of 3D Lattice Boltzmann Flow Solver on a GPU
Author :
Nhat-Phuong Tran;Myungho Lee;Dong Hoon Choi
Author_Institution :
Dept. of Comput. Sci. &
fYear :
2015
Firstpage :
315
Lastpage :
324
Abstract :
The Lattice Boltzmann Method (LBM) is a powerful numerical method for simulating fluid flow. Its data-parallel nature and simple kernel structure make it a promising candidate for parallel implementation on a GPU. The LBM, however, is heavily data-intensive and memory-bound. In particular, moving data to adjacent cells in the streaming phase of the LBM incurs many uncoalesced memory accesses on the GPU, which degrades overall performance. In this paper, we parallelize the LBM on a GPU by incorporating memory-efficient techniques: a tiling optimization combined with data layout changes, and a data update scheme known as the pull scheme. Furthermore, we develop optimization techniques that remove branch divergence, reduce register usage, and reduce the number of double-precision floating-point instructions. Experimental results on an Nvidia Tesla K20 GPU show that our approach delivers up to 1105 MLUPS (Million Lattice Updates Per Second) and a 156-fold speedup over a serial implementation.
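The streaming step and the pull scheme mentioned in the abstract can be illustrated with a minimal CPU sketch. It is not the authors' GPU kernel. It assumes a small 2D lattice with nine velocities (D2Q9), periodic boundaries, and a structure-of-arrays layout `f[q][cell]`, whereas the paper targets a 3D lattice. The names `VELOCITIES`, `cell_index`, `stream_push`, `stream_pull`, and `mlups` are hypothetical. In the usual meaning of the terms, a push update scatters each cell's values to its neighbors, while a pull update gathers values from upstream neighbors and writes only to the cell's own location, so the writes stay contiguous.

```python
# Illustrative sketch only: D2Q9, periodic grid, SoA layout f[q][cell].
# All names here are hypothetical and not taken from the paper.

# D2Q9 discrete velocities as (cx, cy) pairs.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]


def cell_index(x, y, nx, ny):
    """Flatten (x, y) with periodic wrap-around; x is the fastest index."""
    return (y % ny) * nx + (x % nx)


def stream_push(f_src, nx, ny):
    """Push: read the local value, scatter it to the downstream neighbor."""
    f_dst = [[0.0] * (nx * ny) for _ in VELOCITIES]
    for q, (cx, cy) in enumerate(VELOCITIES):
        for y in range(ny):
            for x in range(nx):
                src = cell_index(x, y, nx, ny)
                dst = cell_index(x + cx, y + cy, nx, ny)  # shifted write
                f_dst[q][dst] = f_src[q][src]
    return f_dst


def stream_pull(f_src, nx, ny):
    """Pull: gather from the upstream neighbor, write to the local cell."""
    f_dst = [[0.0] * (nx * ny) for _ in VELOCITIES]
    for q, (cx, cy) in enumerate(VELOCITIES):
        for y in range(ny):
            for x in range(nx):
                src = cell_index(x - cx, y - cy, nx, ny)  # shifted read
                dst = cell_index(x, y, nx, ny)            # contiguous write
                f_dst[q][dst] = f_src[q][src]
    return f_dst


def mlups(num_cells, num_steps, seconds):
    """Million Lattice Updates Per Second for a timed run."""
    return num_cells * num_steps / seconds / 1e6
```

Both functions produce the same post-streaming state; they differ only in whether the shifted access is the read or the write. On a GPU, the pull variant is commonly preferred because consecutive threads then write to consecutive addresses within each velocity array.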
Keywords :
"Graphics processing units","Instruction sets","Registers","Optimization","Computer architecture","Lattice Boltzmann methods"
Publisher :
IEEE
Conference_Title :
High Performance Computing (HiPC), 2015 IEEE 22nd International Conference on
Type :
conf
DOI :
10.1109/HiPC.2015.49
Filename :
7397646