• DocumentCode
    2536282
  • Title

    Mixed-Tool Performance Analysis on Hybrid Multicore Architectures

  • Author

    Du, Peng ; Luszczek, Piotr ; Tomov, Stanimire ; Dongarra, Jack

  • fYear
    2010
  • fDate
    13-16 Sept. 2010
  • Firstpage
    236
  • Lastpage
    244
  • Abstract
    This paper proposes a triangular solve algorithm with variable block size for graphics processing units (GPUs). By inverting the diagonal blocks recursively, the algorithm works with a tunable block size that can be adjusted for best performance. We show several ways to use existing profiling tools to measure and analyze the performance of this algorithm. We draw on the strengths of some of the most popular CPU and GPU profiling tools and overcome their weaknesses with several new techniques for analyzing the performance of, and the relationships among, the different components of an application. The presented methodologies yield insight that helps to understand and tune the proposed algorithm and to considerably improve the performance of both the solver itself and the application using it.
  • Keywords
    computer graphic equipment; coprocessors; CPU; GPU profiling tools; diagonal blocks inversion; graphics processing unit; hybrid multicore architectures; mixed-tool performance analysis; triangular solve algorithm; tunable block size; variable block size; Graphics processing unit; Instruction sets; Instruments; Kernel; Parallel processing; Syntactics; Timing; GPU; profiling; triangular inversion
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing Workshops (ICPPW), 2010 39th International Conference on
  • Conference_Location
    San Diego, CA
  • ISSN
    1530-2016
  • Print_ISBN
    978-1-4244-7918-4
  • Electronic_ISBN
    1530-2016
  • Type

    conf

  • DOI
    10.1109/ICPPW.2010.41
  • Filename
    5599199