• DocumentCode
    1918837
  • Title

    Phase-Based Profiling in GPGPU Kernels

  • Author

    Dietrich, Robert ; Schmitt, Felix ; Widera, René ; Bussmann, Michael

  • Author_Institution
    Center for Inf. Services & High Performance Comput. (ZIH), Tech. Univ. Dresden, Dresden, Germany
  • fYear
    2012
  • fDate
    10-13 Sept. 2012
  • Firstpage
    414
  • Lastpage
    423
  • Abstract
    A growing number of computationally intensive scientific applications use hardware accelerators such as general-purpose graphics processing units (GPGPUs). Compared to software development for typical multi-core processors, programming them is fairly complex and requires hardware-specific optimizations to utilize their full computing power. To achieve high performance, the critical parts of a program have to be identified and optimized. This paper proposes an approach for performance analysis of CUDA kernel source code regions which, for the first time, allows execution times to be measured within GPGPU kernels. We developed a tool that implements the presented method and helps application developers easily identify hot spots within a kernel. The tool uses compile-time code analysis to automatically select and instrument suitable points with minimal program perturbation, and it also supports manual instrumentation. To the best of our knowledge, this is the first approach that allows scalable runtime analysis within GPGPU kernels. Combined with existing performance analysis techniques, it helps developers exploit the full potential of modern parallel systems.
  • Keywords
    graphics processing units; multiprocessing systems; parallel architectures; program diagnostics; software metrics; CUDA; CUDA kernel source code regions; GPGPU kernels; compile time code analysis; execution time measurement; general purpose graphics processing units; hardware accelerators; instrumentation points; minimal program perturbation; multicore processors; parallel systems; performance analysis techniques; phase-based profiling; program identification; program optimization; runtime analysis; scientific applications; software development; Graphics processing unit; Hardware; Instruction sets; Instruments; Kernel; Radiation detectors; Runtime; CUDA; GPGPU; accelerators; many-core; performance analysis; profiling; tracing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing Workshops (ICPPW), 2012 41st International Conference on
  • Conference_Location
    Pittsburgh, PA
  • ISSN
    1530-2016
  • Print_ISBN
    978-1-4673-2509-7
  • Type

    conf

  • DOI
    10.1109/ICPPW.2012.59
  • Filename
    6337509