DocumentCode :
2405703
Title :
Wide area cluster monitoring with Ganglia
Author :
Sacerdoti, Federico D. ; Katz, Mason J. ; Massie, Matthew L. ; Culler, David E.
Author_Institution :
San Diego Supercomputer Center, CA, USA
fYear :
2003
fDate :
1-4 Dec. 2003
Firstpage :
289
Lastpage :
298
Abstract :
In this paper, we present a structure for monitoring a large set of computational clusters. We illustrate methods for scaling a monitor network comprised of many clusters while keeping processing requirements low. A design for presenting high-level Web-based summaries of the monitor network is provided, along with a generalization to a distributed, multiple-resolution monitoring tree. Emphasis is placed on scalability, fast query response, fault tolerance, and grid compatibility. Experimental evidence is presented that demonstrates the performance of our design.
Keywords :
Internet; fault tolerant computing; monitoring; performance evaluation; workstation clusters; Ganglia; Web-based summaries; computational clusters; distributed monitoring tree; fault tolerance; grid compatibility; monitor network; multiple-resolution monitoring tree; query response; wide area cluster monitoring; Computer fault tolerance; Condition monitoring; Displays; Fault tolerance; Feedback; Internet; Monitoring; Production; Remote monitoring; Scalability; Statistics; Tree data structures; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing, 2003. Proceedings. 2003 IEEE International Conference on
Print_ISBN :
0-7695-2066-9
Type :
conf
DOI :
10.1109/CLUSTR.2003.1253327
Filename :
1253327