DocumentCode
3008604
Title
Embedded Gossip: Lightweight Online Measurement for Large-Scale Applications
Author
Zhu, Wenbin ; Bridges, Patrick G. ; Maccabe, Arthur B.
Author_Institution
University of New Mexico
fYear
2007
fDate
25-27 June 2007
Firstpage
58
Lastpage
58
Abstract
For large-scale parallel applications, lightweight online monitoring can enable a wide range of online adaptations, including load balancing, power management, and progress monitoring. The processing and monitoring overhead of centralized global tracing techniques makes them unsuitable for such tasks. Purely local tools, on the other hand, fail to provide the global information that many desirable online adaptations of large-scale applications require. In this paper, we describe Embedded Gossip (EG), a novel distributed online measurement method for large-scale applications. EG piggybacks performance information about application behavior on existing application messages and merges received information with previously known data in a fashion customized to the needs of a particular monitoring task. EG thus provides each process with both local and global views of application behavior at low overhead. To illustrate the capabilities of EG, we show that it disseminates global information in a timely fashion for several monitoring tasks, including critical path profiling, workload imbalance monitoring, and progress monitoring. This global information has many potential uses, including imbalance detection for load balancing and energy management tools, progress monitoring for batch schedulers, and other performance debugging and optimization techniques.
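The piggyback-and-merge idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Process` class, the `View` type, the per-rank version counter, and the `newest_wins` merge policy are all hypothetical names and choices made for illustration. The abstract says only that merging is customized per monitoring task, so the merge function is passed in as a parameter.

```python
from typing import Callable, Dict, Tuple

# Hypothetical view type: rank -> (version, measured value).
View = Dict[int, Tuple[int, float]]


def newest_wins(known: View, received: View) -> View:
    """Example merge policy (an assumption): per rank, keep the higher version."""
    merged = dict(known)
    for rank, (version, value) in received.items():
        if rank not in merged or version > merged[rank][0]:
            merged[rank] = (version, value)
    return merged


class Process:
    """Toy stand-in for one application process (illustrative only)."""

    def __init__(self, rank: int,
                 merge: Callable[[View, View], View] = newest_wins) -> None:
        self.rank = rank
        self.merge = merge
        self.view: View = {}
        self.version = 0

    def record(self, value: float) -> None:
        """Update this process's own local measurement."""
        self.version += 1
        self.view[self.rank] = (self.version, value)

    def send(self, payload: str) -> Tuple[str, View]:
        """Attach a copy of the current view to an outgoing application message."""
        return payload, dict(self.view)

    def receive(self, message: Tuple[str, View]) -> str:
        """Merge the piggybacked view, then return the application payload."""
        payload, piggyback = message
        self.view = self.merge(self.view, piggyback)
        return payload


# Three processes exchange ordinary application messages along a chain 0 -> 1 -> 2.
p0, p1, p2 = Process(0), Process(1), Process(2)
p0.record(1.5)
p1.record(2.0)
p2.record(0.5)
p1.receive(p0.send("halo exchange"))
p2.receive(p1.send("halo exchange"))
```

After the two messages, `p2.view` holds entries for all three ranks even though process 2 never communicated with process 0 directly, while `p0.view` still contains only its own entry. No extra messages were sent; the measurement data travelled only on messages the application was already sending.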
Keywords
Application software; Bridges; Computer architecture; Computer science; Computerized monitoring; Condition monitoring; Energy management; Laboratories; Large-scale systems; Load management
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems, 2007. ICDCS '07. 27th International Conference on
Conference_Location
Toronto, ON, Canada
ISSN
1063-6927
Print_ISBN
0-7695-2837-3
Electronic_ISBN
1063-6927
Type
conf
DOI
10.1109/ICDCS.2007.107
Filename
4268211