DocumentCode
1337247
Title
Large-scale fault isolation
Author
Reddy, Anoop ; Estrin, Deborah ; Govindan, Ramesh
Author_Institution
Inf. Sci. Inst., Univ. of Southern California, Marina del Rey, CA, USA
Volume
18
Issue
5
fYear
2000
fDate
5/1/2000 12:00:00 AM
Firstpage
733
Lastpage
743
Abstract
Of the many distributed applications designed for the Internet, the successful ones are those that have paid careful attention to scale and robustness. These applications share several design principles. In this paper, we illustrate the application of these principles to common network monitoring tasks. Specifically, we describe and evaluate 1) a robust distributed topology discovery mechanism and 2) a mechanism for scalable fault isolation in multicast distribution trees. Our mechanisms reveal a different design methodology for network monitoring: one that carefully trades off monitoring fidelity (where necessary) for more graceful degradation in the presence of different kinds of network dynamics.
Keywords
Internet; computer network management; computer network reliability; degradation; design; distributed applications; large-scale fault isolation; monitoring fidelity; multicast distribution tree; network monitoring; network monitoring tasks; robust distributed topology discovery mechanism; scalable fault isolation; Delay; Design methodology; IP networks; Large-scale systems; Monitoring; Network topology; Robustness; Routing; Videoconference; Web and internet services
fLanguage
English
Journal_Title
Selected Areas in Communications, IEEE Journal on
Publisher
IEEE
ISSN
0733-8716
Type
jour
DOI
10.1109/49.842989
Filename
842989