DocumentCode
2524623
Title
Multi-kernel Auto-Tuning on GPUs: Performance and Energy-Aware Optimization
Author
Guerreiro, Joao ; Ilic, Aleksandar ; Roma, Nuno ; Tomas, Pedro
Author_Institution
Inst. Super. Tecnico, Univ. de Lisboa, Lisbon, Portugal
fYear
2015
fDate
4-6 March 2015
Firstpage
438
Lastpage
445
Abstract
Prompted by their very high computational capabilities and memory bandwidth, Graphics Processing Units (GPUs) are already widely used to accelerate the execution of many scientific applications. However, programmers are still required to have a very detailed knowledge of the GPU internal architecture when tuning the kernels, in order to improve either performance or energy efficiency. Moreover, because different GPU devices have different characteristics, moving a kernel to a different GPU typically requires re-tuning the kernel execution in order to efficiently exploit the underlying hardware. The procedure proposed in this work is based on real-time kernel profiling and GPU monitoring, and it automatically tunes parameters from several concurrent kernels to maximize the performance or minimize the energy consumption. Experimental results on NVIDIA GPU devices with up to 4 concurrent kernels show that the proposed solution achieves near-optimal configurations. Furthermore, significant energy savings can be achieved by using the proposed energy-efficiency auto-tuning procedure.
Keywords
graphics processing units; performance evaluation; power aware computing; GPU devices; GPU internal architecture; GPU monitoring; NVIDIA GPU devices; energy aware optimization; energy efficiency autotuning procedure; memory bandwidth; multikernel auto tuning; performance optimization; real-time kernel profiling; scientific applications; Energy consumption; Frequency measurement; Instruction sets; Kernel; Optimization; CUDA; GPGPU; GPU; OpenCL; auto-tuning; energy-awareness; multikernel
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel, Distributed and Network-Based Processing (PDP), 2015 23rd Euromicro International Conference on
Conference_Location
Turku
ISSN
1066-6192
Type
conf
DOI
10.1109/PDP.2015.44
Filename
7092758