DocumentCode :
3706504
Title :
Automatic Performance Tuning of Stencil Computations on GPUs
Author :
Joseph D. Garvey;Tarek S. Abdelrahman
Author_Institution :
Edward S. Rogers Sr. Dept. of Electr. &
Year :
2015
Firstpage :
300
Lastpage :
309
Abstract :
We consider automatic performance tuning of stencil computations on Graphics Processing Units. We present a strategy that uses machine learning to determine the best way to use memory followed by a heuristic that divides the remaining optimizations into groups and exhaustively explores one group at a time. We evaluate our strategy using 102 synthetically generated OpenCL stencil kernels on an Nvidia GTX Titan GPU. We assess our strategy both in terms of the number of configurations explored during auto-tuning and the quality of the best configuration obtained. We explore two alternative heuristics that use different groupings of the optimizations. We show that, relative to a random sampling of the space and an expert search, our strategy achieves a reduction in the number of configurations explored of up to 80% and 84% respectively while also finding better performing configurations.
Keywords :
"Optimization","Kernel","Merging","Yttrium","Graphics processing units","Parallel processing","Instruction sets"
Publisher :
IEEE
Conference_Title :
Parallel Processing (ICPP), 2015 44th International Conference on
ISSN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2015.39
Filename :
7349585