DocumentCode
3588930
Title
CASITA: A Tool for Identifying Critical Optimization Targets in Distributed Heterogeneous Applications
Author
Schmitt, Felix ; Stolle, Jonas ; Dietrich, Robert
Author_Institution
Center for Inf. Services & High Performance Comput. (ZIH), Tech. Univ. Dresden, Dresden, Germany
fYear
2014
Firstpage
186
Lastpage
195
Abstract
Programming high-performance computing systems has become more complex over time. Several layers of parallelism need to be exploited to efficiently utilize the available resources. To support application developers and performance analysts, we propose a technique for identifying the most performance-critical optimization targets in distributed heterogeneous applications. We have developed CASITA, a tool that uses an execution trace and knowledge of the programming models MPI, OpenMP, and CUDA, as well as their hierarchy, to build a distributed event dependency graph. After locating wait states in this graph, we detect their root causes and compute the critical path, an important property for performance optimization. Compared to existing analysis approaches, we incorporate the hierarchy of multiple programming models and derive a metric from both the time an activity spends on the critical path and the waiting time it causes. For visualization, CASITA enriches the input trace with additional counter information so that results can be inspected in the Vampir trace viewer.
Keywords
application program interfaces; graph theory; message passing; parallel architectures; parallel programming; CASITA tool; CUDA programming model; MPI programming model; OpenMP programming model; Vampir trace viewer; activity metric; counter information; critical optimization target identification; critical path; distributed event dependency graph; distributed heterogeneous applications; execution trace; high-performance computing system programming; multiple programming model hierarchy; parallelism layers; performance optimizations; root cause detection; time metric; wait states; Analytical models; Computational modeling; Graphics processing units; Kernel; Optimization; Programming; Synchronization; CUDA; MPI; OpenMP; critical path analysis; performance analysis; performance optimization; wait states;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing Workshops (ICPPW), 2014 43rd International Conference on
ISSN
1530-2016
Type
conf
DOI
10.1109/ICPPW.2014.35
Filename
7103453