DocumentCode
1918837
Title
Phase-Based Profiling in GPGPU Kernels
Author
Dietrich, Robert ; Schmitt, Felix ; Widera, René ; Bussmann, Michael
Author_Institution
Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, Germany
fYear
2012
fDate
10-13 Sept. 2012
Firstpage
414
Lastpage
423
Abstract
More and more computationally intensive scientific applications make use of hardware accelerators such as general-purpose graphics processing units (GPGPUs). Compared to software development for typical multi-core processors, programming them is fairly complex and requires hardware-specific optimizations to exploit their full computing power. To achieve high performance, the critical parts of a program have to be identified and optimized. This paper proposes an approach for performance analysis of CUDA kernel source code regions that, for the first time, allows measuring execution times within GPGPU kernels. We developed a tool that implements the presented method and helps application developers easily identify hot spots within a kernel. The tool uses compile-time code analysis to automatically instrument suitable points in the code with minimal program perturbation, and it also supports manual instrumentation. To the best of our knowledge, this is the first approach that allows scalable runtime analysis within GPGPU kernels. Combined with existing performance analysis techniques, it helps exploit the full potential of modern parallel systems.
Keywords
graphics processing units; multiprocessing systems; parallel architectures; program diagnostics; software metrics; CUDA; CUDA kernel source code regions; GPGPU kernels; compile time code analysis; execution time measurement; general purpose graphics processing units; hardware accelerators; instrumentation points; minimal program perturbation; multicore processors; parallel systems; performance analysis techniques; phase-based profiling; program identification; program optimization; runtime analysis; scientific applications; software development; Graphics processing unit; Hardware; Instruction sets; Instruments; Kernel; Radiation detectors; Runtime; CUDA; GPGPU; accelerators; many-core; performance analysis; profiling; tracing;
fLanguage
English
Publisher
IEEE
Conference_Titel
2012 41st International Conference on Parallel Processing Workshops (ICPPW)
Conference_Location
Pittsburgh, PA
ISSN
1530-2016
Print_ISBN
978-1-4673-2509-7
Type
conf
DOI
10.1109/ICPPW.2012.59
Filename
6337509