DocumentCode
2536282
Title
Mixed-Tool Performance Analysis on Hybrid Multicore Architectures
Author
Du, Peng ; Luszczek, Piotr ; Tomov, Stanimire ; Dongarra, Jack
fYear
2010
fDate
13-16 Sept. 2010
Firstpage
236
Lastpage
244
Abstract
This paper proposes a triangular solve algorithm with variable block size for graphics processing units (GPUs). By inverting the diagonal blocks with recursion, the algorithm works with a tunable block size that can be chosen for the best performance. We show several ways to use existing profiling tools to measure and analyze the performance of this algorithm. We draw on the strengths of some of the most popular CPU and GPU profiling tools and overcome their limitations with several new techniques for analyzing the performance of, and the relationships between, different components of an application. The presented methodologies yield insight that helps to understand and tune the proposed algorithm and to considerably improve the performance of both the solver itself and the application using it.
Keywords
computer graphic equipment; coprocessors; CPU; GPU profiling tools; diagonal blocks inversion; graphics processing unit; hybrid multicore architectures; mixed-tool performance analysis; triangular solve algorithm; tunable block size; variable block size; Graphics processing unit; Instruction sets; Instruments; Kernel; Parallel processing; Syntactics; Timing; GPU; profiling; triangular inversion
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing Workshops (ICPPW), 2010 39th International Conference on
Conference_Location
San Diego, CA
ISSN
1530-2016
Print_ISBN
978-1-4244-7918-4
Electronic_ISBN
1530-2016
Type
conf
DOI
10.1109/ICPPW.2010.41
Filename
5599199