DocumentCode :
3060001
Title :
Network fault management based on SNMP agent groups
Author :
Duarte, Elias Procópio, Jr. ; Santos, Aldri L dos
Author_Institution :
Dept. of Inf., Fed. Univ. of Parana, Curitiba, Brazil
fYear :
2001
fDate :
April 2001
Firstpage :
51
Lastpage :
56
Abstract :
A network management system must be fault-tolerant in order to provide the required fault management functionality. It is often useful to examine MIB objects of a faulty agent in order to determine why it is faulty. This paper presents a new framework for replicating SNMP management objects in local area networks. The framework is based on groups of agents that communicate with each other using reliable multicast. A group of agents provides fault-tolerant object functionality. An SNMP service is proposed that allows replicated MIB objects of a faulty agent of a given group to be accessed through fault-free agents of that group. The presented framework allows the dynamic definition of agent groups and of the management objects to be replicated in each group. A practical fault-tolerant tool for local area network fault management was implemented and is presented. The system employs SNMP agents that interact with a group communication tool. As an example, we show how the examination of TCP-related objects of faulty agents has been used in the fault diagnosis process. The impact of replication on network performance is evaluated, and a probabilistic analysis of replicated object consistency is presented.
Keywords :
computer network management; computer network reliability; distributed object management; local area networks; multi-agent systems; performance evaluation; transport protocols; MIB objects; SNMP agent groups; TCP; agent communication; computer network fault management; computer network management system; fault diagnosis; fault-free agents; group communication tool; local area networks; network fault-tolerance; network performance; probabilistic analysis; reliable multicast; replicated object consistency; Computer crashes; Electronic mail; Fault detection; Fault diagnosis; Fault tolerance; Fault tolerant systems; Informatics; Local area networks; Monitoring; Protocols;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Distributed Computing Systems Workshop, 2001 International Conference on
Conference_Location :
Mesa, AZ
Print_ISBN :
0-7695-1080-9
Type :
conf
DOI :
10.1109/CDCS.2001.918686
Filename :
918686