DocumentCode :
668130
Title :
A parallel optimization method for stencil computation on the domain that is bigger than memory capacity of GPUs
Author :
Guanghao Jin ; Endo, T. ; Matsuoka, Shingo
Author_Institution :
Tokyo Inst. of Technol., JST-CREST, Tokyo, Japan
fYear :
2013
fDate :
23-27 Sept. 2013
Firstpage :
1
Lastpage :
8
Abstract :
The problem size of stencil computation on a GPU cluster is limited by the memory capacity of the GPUs, which is typically smaller than that of host memory. This paper proposes and evaluates a parallel optimization method for stencil computation that targets scalability, problem sizes larger than the memory capacity of the GPUs, and high performance. The method uses 2D decomposition to scale across GPUs. It then enables a larger sub-domain on each GPU, which allows a larger overall problem size. It applies temporal blocking to improve the memory access locality of the stencil computation, and it reuses earlier results to resolve the redundant-computation problem, yielding higher performance. An evaluation of stencil simulations on a 3D domain shows that the new method achieves good scalability for 7-point and 19-point stencils on GPUs, performing on average 1.45 times and 1.72 times better, respectively, than other methods.
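The temporal blocking idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1D 3-point averaging stencil with fixed end cells instead of the 3D 7-point and 19-point stencils, runs on the CPU in plain Python, and omits the 2D decomposition, the host-to-GPU transfers, and the reuse of earlier results. The names `step`, `naive`, `temporally_blocked`, `block_width`, and `block_steps` are hypothetical. Each sub-domain is loaded once with a halo as wide as the number of time steps in the block, advanced several steps locally, and only its interior is written back.

```python
def step(u):
    """One Jacobi sweep of a 3-point averaging stencil; both end cells are held fixed.

    Assumes len(u) >= 2.
    """
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def naive(u, steps):
    """Reference version: sweep the whole domain once per time step."""
    for _ in range(steps):
        u = step(u)
    return u


def temporally_blocked(u, steps, block_width, block_steps):
    """Advance `steps` time steps, processing sub-domains of `block_width` cells.

    Each sub-domain is loaded with a halo of t cells per side (t <= block_steps),
    advanced t steps locally, and only its interior is stored. Halo cells are
    recomputed by neighbouring sub-domains, which is the redundant work that
    temporal blocking trades for better locality.
    """
    n = len(u)
    done = 0
    while done < steps:
        t = min(block_steps, steps - done)
        out = list(u)
        for start in range(0, n, block_width):
            end = min(start + block_width, n)
            lo = max(start - t, 0)   # halo clipped at the global boundary
            hi = min(end + t, n)
            chunk = u[lo:hi]         # stands in for one transfer to fast memory
            for _ in range(t):
                chunk = step(chunk)  # cells near a cut edge become stale
            # After t steps only cells at least t away from a cut edge are valid;
            # the interior [start, end) satisfies that by construction.
            out[start:end] = chunk[start - lo:end - lo]
        u = out
        done += t
    return u
```

In this sketch, a larger `block_steps` means fewer loads of each sub-domain but a wider halo, and therefore more redundant computation per load.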
Keywords :
graphics processing units; optimisation; storage management; 2D decomposition; GPU cluster; memory access locality; memory capacity; parallel optimization; stencil computation; temporal blocking method; Radio frequency; temporal blocking
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing (CLUSTER), 2013 IEEE International Conference on
Conference_Location :
Indianapolis, IN
Type :
conf
DOI :
10.1109/CLUSTER.2013.6702633
Filename :
6702633