Title :
A parallel optimization method for stencil computation on the domain that is bigger than memory capacity of GPUs
Author :
Guanghao Jin ; Endo, T. ; Matsuoka, Satoshi
Author_Institution :
Tokyo Inst. of Technol., JST-CREST, Tokyo, Japan
Abstract :
The problem size of stencil computation on a GPU cluster is limited by the memory capacity of the GPUs, which is typically smaller than that of host memory. This paper proposes and evaluates a parallel optimization method for stencil computation that targets scalability, problem sizes larger than the memory capacity of the GPUs, and high performance. The method uses 2D decomposition to achieve scalability over GPUs. It then enables a larger sub-domain on each GPU, and thus a larger overall problem size. It applies temporal blocking to improve the memory access locality of the stencil computation, and it reuses earlier results to reduce redundant computation and obtain higher performance. Evaluation of stencil simulations on a 3D domain shows that the new method achieves good scalability for 7-point and 19-point stencils on GPUs, performing on average 1.45 times and 1.72 times better, respectively, than other methods.
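The temporal blocking idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1D 3-point averaging stencil with fixed end points instead of the 3D 7-point and 19-point stencils, runs on the CPU in plain Python, and the names `step`, `naive`, and `temporal_blocked` are hypothetical. Each block is loaded once together with a halo as wide as the number of time steps, advanced several steps locally, and only its core is written back. The halo cells are recomputed by neighbouring blocks; this sketch does not include the reuse of earlier results that the abstract describes for reducing that redundancy.

```python
def step(u):
    """One Jacobi sweep of a 3-point averaging stencil; both end cells are held fixed."""
    out = u[:]
    for i in range(1, len(u) - 1):
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def naive(u, steps):
    """Reference version: sweep the whole domain once per time step."""
    for _ in range(steps):
        u = step(u)
    return u


def temporal_blocked(u, steps, block):
    """Advance `steps` time steps block by block (assumed simplification).

    Each block is copied with a halo of `steps` cells per side, so that the
    stale values at the tile edges cannot reach the block core within
    `steps` sweeps. Only the core is written back.
    """
    n = len(u)
    result = u[:]
    for start in range(0, n, block):
        end = min(start + block, n)
        lo = max(start - steps, 0)   # halo clipped at the domain boundary
        hi = min(end + steps, n)
        tile = u[lo:hi]              # one "transfer" of the block plus halo
        for _ in range(steps):
            tile = step(tile)        # several time steps on local data
        result[start:end] = tile[start - lo:end - lo]
    return result


if __name__ == "__main__":
    domain = [float((i * i) % 7) for i in range(40)]
    print(temporal_blocked(domain, 4, 8) == naive(domain, 4))
```

In a GPU setting the tile copy would correspond to one host-to-device transfer that is amortized over several time steps, which is what makes sub-domains larger than device memory practical; the block size and step count used here are arbitrary illustration values.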
Keywords :
graphics processing units; optimisation; storage management; 2D decomposition; GPU cluster; memory access locality; memory capacity; parallel optimization; stencil computation; temporal blocking method; Radio frequency; temporal blocking
Conference_Title :
Cluster Computing (CLUSTER), 2013 IEEE International Conference on
Conference_Location :
Indianapolis, IN
DOI :
10.1109/CLUSTER.2013.6702633