DocumentCode :
2000149
Title :
Acceleration of a High Order Finite-Difference WENO Scheme for Large-Scale Cosmological Simulations on GPU
Author :
Chen Meng ; Long Wang ; Zongyan Cao ; Xianfeng Ye ; Long-Long Feng
Author_Institution :
Supercomput. Center, Comput. Network Inf. Center, Beijing, China
fYear :
2013
fDate :
20-24 May 2013
Firstpage :
2071
Lastpage :
2078
Abstract :
In this work, we present our implementation of a three-dimensional, 5th-order finite-difference weighted essentially non-oscillatory (WENO) scheme in double precision on CPU/GPU clusters, targeting large-scale cosmological hydrodynamic flow simulations that involve both shocks and complicated smooth solution structures. At the MPI parallelization level, we subdivided the domain along each of the three axial directions. Then, on each process, we ported the WENO computation to the GPU. The method is memory-bound because of the calculation of the weights, which becomes a greater challenge for a 3D high-order problem in double precision. To make full use of the GPU's computing power and work around its memory limitations, we performed a series of optimizations focused on memory access patterns at all levels. We subjected the code to a number of typical tests to evaluate its effectiveness and efficiency. Our tests indicate that, relative to a single-threaded Fortran reference code, the GPU version achieves a speed-up of 12 to 19 times, and about 19 to 36 times in the computation part. We analyzed the results on both Fermi and Kepler GPUs. We also outline what is needed to further increase the speed by reducing the time spent on communication, along with other future work.
Keywords :
astronomy computing; cosmology; finite difference methods; flow simulation; graphics processing units; hydrodynamics; message passing; CPU-GPU clusters; Fermi GPU; Kepler GPU; MPI parallelization; axial directions; graphics processing unit; high order finite-difference WENO scheme; large-scale cosmological hydrodynamic flow simulations; memory limitation; message passing interface; monothread Fortran code reference; solution structures; weighted essentially nonoscillatory scheme; Electric shock; Equations; Graphics processing units; Instruction sets; Kernel; Mathematical model; Three-dimensional displays; 3D; GPU; WENO; cosmological hydrodynamic; double precision;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
Type :
conf
DOI :
10.1109/IPDPSW.2013.169
Filename :
6651112