DocumentCode :
3588930
Title :
CASITA: A Tool for Identifying Critical Optimization Targets in Distributed Heterogeneous Applications
Author :
Schmitt, Felix ; Stolle, Jonas ; Dietrich, Robert
Author_Institution :
Center for Inf. Services & High Performance Comput. (ZIH), Tech. Univ. Dresden, Dresden, Germany
fYear :
2014
Firstpage :
186
Lastpage :
195
Abstract :
Programming high performance computing systems has become more complex over time. Several layers of parallelism must be exploited to utilize the available resources efficiently. To support application developers and performance analysts, we propose a technique for identifying the most performance-critical optimization targets in distributed heterogeneous applications. We have developed CASITA, a tool that uses an execution trace and knowledge of the programming models MPI, OpenMP, and CUDA, as well as their hierarchy, to build a distributed event dependency graph. After locating wait states in this graph, we detect their root causes and compute the critical path, an important property for performance optimization. Compared to existing analysis approaches, we incorporate the hierarchy of multiple programming models and derive a metric from both the time an activity spends on the critical path and the waiting time it causes. For visualization, CASITA enriches the input trace with additional counter information so that results can be inspected in the Vampir trace viewer.
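The idea of ranking activities by critical-path time plus caused waiting time can be illustrated with a minimal sketch. It is not CASITA's implementation: the graph encoding, the function names `critical_path` and `rank_activities`, the `blame` mapping, and the plain sum used to combine the two quantities are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def critical_path(edges, source, sink):
    """Longest path through a DAG of events.

    edges: list of (from_event, to_event, duration, activity_name).
    Returns (total_length, [(activity_name, duration), ...]).
    Hypothetical encoding, not the format used by any real tool.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = {source, sink}
    for u, v, d, name in edges:
        succ[u].append((v, d, name))
        indeg[v] += 1
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    order = []
    ready = [n for n in nodes if indeg[n] == 0]
    while ready:
        n = ready.pop()
        order.append(n)
        for v, _, _ in succ[n]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # Relax edges in topological order, keeping the longest distance.
    dist = {source: 0.0}
    best = {}
    for u in order:
        if u not in dist:
            continue
        for v, d, name in succ[u]:
            if dist[u] + d > dist.get(v, float("-inf")):
                dist[v] = dist[u] + d
                best[v] = (u, d, name)

    # Walk back from the sink to recover the path.
    path = []
    node = sink
    while node != source:
        u, d, name = best[node]
        path.append((name, d))
        node = u
    path.reverse()
    return dist[sink], path


def rank_activities(path, blame):
    """Score = time on the critical path + waiting time caused (assumed sum)."""
    score = defaultdict(float)
    for name, d in path:
        score[name] += d
    for name, waited in blame.items():
        score[name] += waited
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Toy trace: two ranks meet at a barrier; rank 1 waits 3 s for rank 0.
edges = [
    ("start", "r0_ready", 4.0, "compute_A"),
    ("start", "r1_ready", 1.0, "compute_B"),
    ("r0_ready", "barrier", 0.0, "sync"),
    ("r1_ready", "barrier", 0.0, "sync"),
    ("barrier", "end", 2.0, "compute_C"),
]
length, path = critical_path(edges, "start", "end")
ranking = rank_activities(path, blame={"compute_A": 3.0})
print(length, ranking)
```

In this toy trace, `compute_A` ranks first because it both lies on the critical path and causes the wait state on the other rank, which is the kind of optimization target the abstract describes.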
Keywords :
application program interfaces; graph theory; message passing; parallel architectures; parallel programming; CASITA tool; CUDA programming model; MPI programming model; OpenMP programming model; Vampir trace viewer; activity metric; counter information; critical optimization target identification; critical path; distributed event dependency graph; distributed heterogeneous applications; execution trace; high-performance computing system programming; multiple programming model hierarchy; parallelism layers; performance optimizations; root cause detection; time metric; wait states; Analytical models; Computational modeling; Graphics processing units; Kernel; Optimization; Programming; Synchronization; CUDA; MPI; OpenMP; critical path analysis; performance analysis; performance optimization; wait states;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing Workshops (ICPPW), 2014 43rd International Conference on
ISSN :
1530-2016
Type :
conf
DOI :
10.1109/ICPPW.2014.35
Filename :
7103453