DocumentCode :
3596105
Title :
Effective sampling-driven performance tools for GPU-accelerated supercomputers
Author :
Chabbi, Milind ; Murthy, K. ; Fagan, Michael ; Mellor-Crummey, John
Author_Institution :
Dept. of Comput. Sci., Rice Univ., Houston, TX, USA
fYear :
2013
Firstpage :
1
Lastpage :
12
Abstract :
Performance analysis of GPU-accelerated systems requires a system-wide view that considers both CPU and GPU components. In this paper, we describe how to extend system-wide, sampling-based performance analysis methods to GPU-accelerated systems. Because current GPUs do not support sampling, our implementation required careful coordination of instrumentation-based performance data collection on GPUs with the sampling-based methods employed on CPUs. We also introduce a novel technique for analyzing systemic idleness in CPU/GPU systems. We demonstrate the effectiveness of our techniques with application case studies on Titan and Keeneland. Highlights of our case studies include: 1) we improved performance for LULESH 1.0 by 30%; 2) we identified a hardware performance problem on Keeneland; 3) we identified a scaling problem in LAMMPS derived from CUDA initialization; and 4) we identified a performance problem caused by GPU synchronization operations that suffer delays due to blocking system calls.
Keywords :
graphics processing units; parallel architectures; parallel machines; performance evaluation; sampling methods; synchronisation; CPU-GPU systems; CUDA initialization; GPU synchronization operations; GPU-accelerated supercomputers; GPU-accelerated systems; Keeneland; LAMMPS; LULESH 1.0; Titan; blocking system calls; hardware performance problem; instrumentation-based performance data collection; sampling-driven performance tools; system-wide sampling-based performance analysis methods; Context; Graphics processing units; Instruments; Kernel; Measurement; Radiation detectors; Tuning; CPU-GPU blame shifting; Call path profiling; Heterogeneous architectures; Performance analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SC), 2013 International Conference for
Print_ISBN :
978-1-4503-2378-9
Type :
conf
DOI :
10.1145/2503210.2503299
Filename :
6877476