DocumentCode
415777
Title
Real-time problem determination in distributed systems using active probing
Author
Rish, Irina ; Brodie, Mark ; Odintsova, Natalia ; Ma, Sheng ; Grabarnik, Genady
Author_Institution
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
Volume
1
fYear
2004
fDate
23-23 April 2004
Firstpage
133
Abstract
We describe algorithms and an architecture for a real-time problem determination system that uses online selection of most-informative measurements - the approach called herein active probing. Probes are end-to-end test transactions which gather information about system components. Active probing allows probes to be selected and sent on-demand, in response to one's belief about the state of the system. At each step the most informative next probe is computed and sent. As probe results are received, belief about the system state is updated using probabilistic inference. This process continues until the problem is diagnosed. We demonstrate through both analysis and simulation that the active probing scheme greatly reduces both the number of probes and the time needed for localizing the problem when compared with non-active probing schemes.
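The loop the abstract describes (pick the most informative probe, send it, update the belief, repeat until diagnosed) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes exactly one faulty component, a uniform prior, and noise-free probes that fail if and only if their path contains the faulty component. The names `entropy`, `update`, `expected_entropy`, `diagnose`, and `run_probe` are hypothetical and chosen for illustration only.

```python
import math


def entropy(belief):
    """Shannon entropy (bits) of a belief: component -> P(component is faulty)."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def update(belief, probe, failed):
    """Bayesian update for a noise-free probe (assumption: a probe fails
    exactly when the faulty component lies on its path)."""
    posterior = {c: p for c, p in belief.items() if (c in probe) == failed}
    total = sum(posterior.values())
    return {c: p / total for c, p in posterior.items()}


def expected_entropy(belief, probe):
    """Expected entropy of the belief after observing the probe's outcome."""
    p_fail = sum(p for c, p in belief.items() if c in probe)
    p_ok = sum(p for c, p in belief.items() if c not in probe)
    result = 0.0
    for failed, p_outcome in ((True, p_fail), (False, p_ok)):
        if p_outcome > 0:
            result += p_outcome * entropy(update(belief, probe, failed))
    return result


def diagnose(probes, components, run_probe):
    """Repeatedly send the probe with the lowest expected posterior entropy
    (equivalently, the highest expected information gain)."""
    belief = {c: 1.0 / len(components) for c in components}  # uniform prior (assumption)
    sent = []
    while entropy(belief) > 1e-9:
        best = min(probes, key=lambda name: expected_entropy(belief, probes[name]))
        if expected_entropy(belief, probes[best]) >= entropy(belief) - 1e-12:
            break  # no available probe would reduce uncertainty any further
        failed = run_probe(best)  # send the probe and observe its result
        belief = update(belief, probes[best], failed)
        sent.append(best)
    return belief, sent


if __name__ == "__main__":
    # Toy system: four components, two probes, component "c" is actually faulty.
    probes = {"p1": {"a", "b"}, "p2": {"a", "c"}}
    belief, sent = diagnose(probes, ["a", "b", "c", "d"], lambda name: "c" in probes[name])
    print(belief, sent)
```

Under these assumptions, choosing the probe that minimizes expected posterior entropy is the same as choosing the one with the largest expected information gain. A noisy-probe or multi-fault model would need a richer belief representation than the single dictionary used here.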
Keywords
computer network management; computer network reliability; inference mechanisms; monitoring; real-time systems; uncertainty handling; active probing; distributed systems; end-to-end test transactions; probabilistic inference; real-time monitoring; real-time problem determination; self-managing networks; Analytical models; Artificial intelligence; Computational modeling; Drives; Information theory; Monitoring; Probes; Real time systems; System testing; Time measurement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Network Operations and Management Symposium, 2004. NOMS 2004. IEEE/IFIP
Conference_Location
Seoul, South Korea
ISSN
1542-1201
Print_ISBN
0-7803-8230-7
Type
conf
DOI
10.1109/NOMS.2004.1317650
Filename
1317650