Title of article :
A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
Author/Authors :
Garvey, Joseph D. (Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto); Abdelrahman, Tarek S. (Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto)
Pages :
25
From page :
1
To page :
25
Abstract :
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computations on Graphics Processing Units (GPUs). The strategy uses a machine learning model to predict the optimal way to load data from memory, followed by a heuristic that divides the other optimizations into groups and exhaustively explores one group at a time. We use a set of 104 synthetic OpenCL stencil benchmarks that are representative of many real stencil computations. We first demonstrate the need for auto-tuning by showing that the optimization space is sufficiently complex that simple approaches to determining a high-performing configuration fail. We then demonstrate the effectiveness of our approach on NVIDIA and AMD GPUs. Relative to a random sampling of the space, we find configurations that are 12% and 32% faster on the NVIDIA and AMD platforms in 71% and 4% less time, respectively. Relative to an expert search, we achieve 5% and 9% better performance on the two platforms in 89% and 76% less time, respectively. We also evaluate our strategy for different stencil computational intensities, for varying array sizes and shapes, and in combination with expert search.
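The group-at-a-time search mentioned in the abstract can be illustrated with a minimal sketch. This is one possible reading of the idea, not the authors' implementation: the parameter names, the grouping, and the toy cost function are all hypothetical stand-ins, and the machine learning step that predicts how to load data from memory is omitted (its choice is assumed to be already fixed in the starting configuration).

```python
from itertools import product


def tune_by_groups(groups, cost, start):
    """Exhaustively search one group of parameters at a time.

    groups: list of dicts mapping parameter name -> list of candidate values
    cost:   callable taking a full configuration dict and returning a runtime
    start:  initial full configuration dict
    """
    best = dict(start)
    best_cost = cost(best)
    evaluations = 1
    for group in groups:
        names = list(group)
        # Try every combination of values within this group only,
        # holding the parameters of all other groups fixed.
        for values in product(*(group[n] for n in names)):
            candidate = {**best, **dict(zip(names, values))}
            c = cost(candidate)
            evaluations += 1
            if c < best_cost:
                best, best_cost = candidate, c
    return best, best_cost, evaluations


# Hypothetical parameter groups; none of these names come from the paper.
groups = [
    {"block_x": [16, 32, 64], "block_y": [1, 4, 8]},
    {"unroll": [1, 2, 4], "coarsen": [1, 2]},
]


def toy_cost(cfg):
    # Toy stand-in for a measured kernel runtime (assumption for illustration).
    return (abs(cfg["block_x"] - 32) + abs(cfg["block_y"] - 4)
            + abs(cfg["unroll"] - 2) + abs(cfg["coarsen"] - 2))


start = {"block_x": 16, "block_y": 1, "unroll": 1, "coarsen": 1}
best, best_cost, evaluations = tune_by_groups(groups, toy_cost, start)
```

In this toy setting the search evaluates the starting point plus 9 + 6 configurations, rather than the 54 points of the full cross product. The price is that interactions between parameters in different groups are not explored jointly.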
Keywords :
A Strategy, GPUs, Automatic Performance Tuning, Stencil Computations
Journal title :
Scientific Programming
Serial Year :
2018
Full Text URL :
Record number :
2608817