  • DocumentCode
    2384387
  • Title
    Online Critical Path Profiling for Parallel Applications
  • Author
    Zhu, Wenbin ; Bridges, Patrick G. ; Maccabe, Arthur B.
  • Author_Institution
    Dept. of Comput. Sci., New Mexico Univ., Albuquerque, NM
  • fYear
    2005
  • fDate
    Sept. 2005
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    Online monitoring of parallel applications is increasingly important for techniques such as load balancing, protocol adaptation, and online anomaly detection. Unfortunately, existing online monitoring techniques only monitor individual hosts in a distributed-memory parallel application. In this paper, we show how a new monitoring technique, message-centric monitoring, can be used for online monitoring of the complete critical path in distributed-memory parallel applications. Results from an MPI-based message-centric monitoring prototype called IMPuLSE show that it has less than 3% runtime overhead, accurately measures whole-system performance as the application runs, and captures data that can be used by nodes to detect unusual system behaviors at runtime.
  • Keywords
    message passing; parallel processing; system monitoring; IMPuLSE monitoring; MPI; distributed-memory parallel application; load balancing; message-centric monitoring; online anomaly detection; online critical path profiling; online monitoring; parallel applications; protocol adaptation; Application software; Bridges; Computer science; Computerized monitoring; Load management; Protocols; Prototypes; Runtime; Statistical distributions; Subcontracting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing, 2005. IEEE International
  • Conference_Location
    Burlington, MA
  • ISSN
    1552-5244
  • Print_ISBN
    0-7803-9486-0
  • Electronic_ISBN
    1552-5244
  • Type
    conf
  • DOI
    10.1109/CLUSTR.2005.347048
  • Filename
    4154091