Title :
Using reinforcement learning for pro-active network fault management
Author :
He, Qiming ; Shayman, Mark A.
Author_Institution :
Dept. of Electr. & Comput. Eng., Maryland Univ., College Park, MD, USA
Abstract :
For high-speed networks, it is important that fault management be pro-active, i.e., that it detect, diagnose, and mitigate problems before they result in severe degradation of network performance. Pro-active fault management depends on monitoring the network to obtain the data on which manager decisions are based. However, monitoring introduces additional overhead that may itself degrade network performance, especially when the network is in a stressed state. Thus, a trade-off must be made between the amount of data collected and transferred on the one hand, and the speed and accuracy of fault detection and diagnosis on the other. Such a trade-off can be naturally formulated as a partially observable Markov decision process (POMDP), whose solution can be used to construct a decision rule for both centralized and distributed intelligent agents. Since the exact solution of POMDPs for a realistic number of states is computationally prohibitive, we develop a fast reinforcement-learning-based algorithm that learns the decision rule in an approximate network simulator, so that the rule can be deployed quickly to the real network. Simulation results are given for diagnosing a switch fault in an ATM network.
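The POMDP formulation mentioned in the abstract rests on a belief state: the manager cannot observe the fault directly and instead tracks a probability that the fault is present, revising it after each monitoring observation. The Python sketch below illustrates only that generic belief update for a two-state case (switch healthy or faulty). It is not the authors' algorithm; the function names, the per-step failure probability, and the alarm likelihoods are all hypothetical values chosen for illustration.

```python
def predict(belief_fault: float, p_fail: float) -> float:
    """Propagate P(fault) one time step.

    Assumption (not from the paper): a healthy switch fails with
    probability p_fail per step, and a faulty switch stays faulty.
    """
    return belief_fault + (1.0 - belief_fault) * p_fail


def update(belief_fault: float, alarm: bool,
           p_alarm_given_fault: float, p_alarm_given_ok: float) -> float:
    """Bayes update of P(fault) after one monitoring observation.

    alarm is True if the polled data looked abnormal. The two
    likelihoods are hypothetical sensor-model parameters.
    """
    like_fault = p_alarm_given_fault if alarm else 1.0 - p_alarm_given_fault
    like_ok = p_alarm_given_ok if alarm else 1.0 - p_alarm_given_ok
    numerator = like_fault * belief_fault
    evidence = numerator + like_ok * (1.0 - belief_fault)
    # If the observation is impossible under both states, keep the prior.
    return numerator / evidence if evidence > 0.0 else belief_fault


if __name__ == "__main__":
    belief = 0.5
    # One abnormal reading from a fairly reliable monitor raises the belief.
    belief = update(belief, alarm=True,
                    p_alarm_given_fault=0.9, p_alarm_given_ok=0.1)
    print(round(belief, 3))
```

A decision rule of the kind the abstract describes would map this belief to an action, for example whether to request more monitoring data (at some overhead cost) or to declare a diagnosis. How that mapping is learned is the subject of the paper and is not reproduced here.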
Keywords :
Markov processes; computerised monitoring; decision theory; fault diagnosis; learning (artificial intelligence); telecommunication computing; telecommunication network management; ATM network; data collection; data transfer; decision-rule; fault detection; fault diagnosis; high-speed networks; intelligent agents; manager decisions; network monitoring; network performance; partially observable Markov decision process; proactive network fault management; reinforcement learning; switch fault; Computational modeling; Computer networks; Degradation; Fault detection; Fault diagnosis; High-speed networks; Intelligent agent; Learning; Monitoring; Switches;
Conference_Title :
Communication Technology Proceedings, 2000. WCC - ICCT 2000. International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7803-6394-9
DOI :
10.1109/ICCT.2000.889257