• DocumentCode
    3060001
  • Title

    Network fault management based on SNMP agent groups

  • Author

    Duarte, Elias Procópio, Jr. ; Santos, Aldri L dos

  • Author_Institution
    Dept. of Inf., Fed. Univ. of Parana, Curitiba, Brazil
  • fYear
    2001
  • fDate
    36982
  • Firstpage
    51
  • Lastpage
    56
  • Abstract
    A network management system must be fault-tolerant in order to provide the required fault management functionality. It is often useful to examine MIB objects of a faulty agent in order to determine why it is faulty. This paper presents a new framework for replicating SNMP management objects in local area networks. The framework is based on groups of agents that communicate with each other using reliable multicast. A group of agents provides fault-tolerant object functionality. An SNMP service is proposed that allows replicated MIB objects of a faulty agent of a given group to be accessed through fault-free agents of that group. The framework allows agent groups, and the management objects to be replicated in each group, to be defined dynamically. A practical fault-tolerant tool for local area network fault management was implemented and is presented. The system employs SNMP agents that interact with a group communication tool. As an example, we show how the examination of TCP-related objects of faulty agents has been used in the fault diagnosis process. The impact of replication on network performance is evaluated, and a probabilistic analysis of replicated object consistency is presented.
  • Keywords
    computer network management; computer network reliability; distributed object management; local area networks; multi-agent systems; performance evaluation; transport protocols; MIB objects; SNMP agent groups; TCP; agent communication; computer network fault management; computer network management system; fault diagnosis; fault-free agents; group communication tool; local area networks; network fault-tolerance; network performance; probabilistic analysis; reliable multicast; replicated object consistency; Computer crashes; Electronic mail; Fault detection; Fault diagnosis; Fault tolerance; Fault tolerant systems; Informatics; Local area networks; Monitoring; Protocols;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems Workshop, 2001 International Conference on
  • Conference_Location
    Mesa, AZ
  • Print_ISBN
    0-7695-1080-9
  • Type

    conf

  • DOI
    10.1109/CDCS.2001.918686
  • Filename
    918686