DocumentCode
2384387
Title
Online Critical Path Profiling for Parallel Applications
Author
Zhu, Wenbin ; Bridges, Patrick G. ; Maccabe, Arthur B.
Author_Institution
Dept. of Comput. Sci., New Mexico Univ., Albuquerque, NM
fYear
2005
fDate
Sept. 2005
Firstpage
1
Lastpage
9
Abstract
Online monitoring of parallel applications is increasingly important for techniques such as load balancing, protocol adaptation, and online anomaly detection. Unfortunately, existing online monitoring techniques only monitor individual hosts in a distributed-memory parallel application. In this paper, we show how a new monitoring technique, message-centric monitoring, can be used for online monitoring of the complete critical path in distributed-memory parallel applications. Results from an MPI-based message-centric monitoring prototype called IMPuLSE show that it has less than 3% runtime overhead, accurately measures whole-system performance as the application runs, and captures data that can be used by nodes to detect unusual system behaviors at runtime.
Keywords
message passing; parallel processing; system monitoring; IMPuLSE monitoring; MPI; distributed-memory parallel application; load balancing; message-centric monitoring; online anomaly detection; online critical path profiling; online monitoring; parallel applications; protocol adaptation; Application software; Bridges; Computer science; Computerized monitoring; Load management; Protocols; Prototypes; Runtime; Statistical distributions; Subcontracting
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing, 2005. IEEE International
Conference_Location
Burlington, MA
ISSN
1552-5244
Print_ISBN
0-7803-9486-0
Electronic_ISBN
1552-5244
Type
conf
DOI
10.1109/CLUSTR.2005.347048
Filename
4154091