DocumentCode :
2991013
Title :
Understanding the impact of CUDA tuning techniques for Fermi
Author :
Torres, Yuri ; Gonzalez-Escribano, Arturo ; Llanos, Diego R.
Author_Institution :
Dept. de Inf., Univ. de Valladolid, Valladolid, Spain
fYear :
2011
fDate :
4-8 July 2011
Firstpage :
631
Lastpage :
639
Abstract :
While the correctness of an NVIDIA CUDA program is easy to achieve, exploiting the GPU capabilities to obtain the best possible performance is a task for experienced CUDA programmers. Typical code tuning strategies, such as choosing an appropriate size and shape for the thread blocks, programming good coalescing, or maximizing occupancy, are interdependent. Moreover, the choices also depend on the underlying architecture details and on the global-memory access pattern of the designed solution. For example, the size and shape of threadblocks are usually chosen to facilitate encoding (e.g. square shapes) while maximizing the multiprocessors' occupancy. However, this simple choice does not usually provide the best performance results. In this paper we discuss important relations between the size and shape of threadblocks, occupancy, global memory access patterns, and other Fermi architecture features, such as the configuration of the new transparent cache. We present an insight-based approach to tuning techniques, providing guidelines to understand the complex relations and to easily avoid bad tuning settings.
Keywords :
cache storage; coprocessors; multiprocessing systems; CUDA experienced programmers; CUDA tuning techniques; Fermi architecture; GPU capabilities; NVIDIA CUDA program; encoding; global memory access patterns; global-memory access pattern; multiprocessors; threadblocks; transparent cache; Cache memory; Graphics processing unit; Hardware; Instruction sets; Kernel; Shape; Tuning; Fermi; GPU; code tuning; performance;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Simulation (HPCS), 2011 International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-61284-380-3
Type :
conf
DOI :
10.1109/HPCSim.2011.5999886
Filename :
5999886