DocumentCode
3060001
Title
Network fault management based on SNMP agent groups
Author
Duarte, Elias Procópio, Jr. ; Santos, Aldri L dos
Author_Institution
Dept. of Inf., Fed. Univ. of Parana, Curitiba, Brazil
fYear
2001
fDate
36982
Firstpage
51
Lastpage
56
Abstract
A network management system must be fault-tolerant in order to provide the required fault management functionality. It is often useful to examine MIB objects of a faulty agent in order to determine why it is faulty. This paper presents a new framework for replicating SNMP management objects in local area networks. The framework is based on groups of agents that communicate with each other using reliable multicast. A group of agents provides fault-tolerant object functionality. An SNMP service is proposed that allows replicated MIB objects of a faulty agent of a given group to be accessed through fault-free agents of that group. The presented framework allows the dynamic definition of agent groups and of the management objects to be replicated in each group. A practical fault-tolerant tool for local area network fault management was implemented and is presented. The system employs SNMP agents that interact with a group communication tool. As an example, we show how the examination of TCP-related objects of faulty agents has been used in the fault diagnosis process. The impact of replication on network performance is evaluated, and a probabilistic analysis of replicated object consistency is presented.
Keywords
computer network management; computer network reliability; distributed object management; local area networks; multi-agent systems; performance evaluation; transport protocols; MIB objects; SNMP agent groups; TCP; agent communication; computer network fault management; computer network management system; fault diagnosis; fault-free agents; group communication tool; local area networks; network fault-tolerance; network performance; probabilistic analysis; reliable multicast; replicated object consistency; Computer crashes; Electronic mail; Fault detection; Fault diagnosis; Fault tolerance; Fault tolerant systems; Informatics; Local area networks; Monitoring; Protocols;
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems Workshop, 2001 International Conference on
Conference_Location
Mesa, AZ
Print_ISBN
0-7695-1080-9
Type
conf
DOI
10.1109/CDCS.2001.918686
Filename
918686
Link To Document