• DocumentCode
    1337247
  • Title

    Large-scale fault isolation

  • Author

    Reddy, Anoop ; Estrin, Deborah ; Govindan, Ramesh

  • Author_Institution
    Inf. Sci. Inst., Univ. of Southern California, Marina del Rey, CA, USA
  • Volume
    18
  • Issue
    5
  • fYear
    2000
  • fDate
    5/1/2000
  • Firstpage
    733
  • Lastpage
    743
  • Abstract
    Of the many distributed applications designed for the Internet, the successful ones are those that have paid careful attention to scale and robustness. These applications share several design principles. In this paper, we illustrate the application of these principles to common network monitoring tasks. Specifically, we describe and evaluate 1) a robust distributed topology discovery mechanism and 2) a mechanism for scalable fault isolation in multicast distribution trees. Our mechanisms reveal a different design methodology for network monitoring: one that carefully trades off monitoring fidelity (where necessary) for more graceful degradation in the presence of different kinds of network dynamics.
  • Keywords
    Internet; computer network management; computer network reliability; degradation; design; distributed applications; large-scale fault isolation; monitoring fidelity; multicast distribution tree; network monitoring; network monitoring tasks; robust distributed topology discovery mechanism; scalable fault isolation; Delay; Design methodology; IP networks; Large-scale systems; Monitoring; Network topology; Robustness; Routing; Videoconference; Web and internet services
  • fLanguage
    English
  • Journal_Title
    Selected Areas in Communications, IEEE Journal on
  • Publisher
    IEEE
  • ISSN
    0733-8716
  • Type

    jour

  • DOI
    10.1109/49.842989
  • Filename
    842989