DocumentCode :
2568642
Title :
A comparative analysis of event tupling schemes
Author :
Buckley, Michael F. ; Siewiorek, Daniel P.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
1996
fDate :
25-27 Jun 1996
Firstpage :
294
Lastpage :
303
Abstract :
Event logs provide an effective means of improving system availability. However, the majority of faults produce many errors because faults propagate in the time and error detection domains. Thus, the ability to coalesce related events is critical. The tupling heuristics developed at Carnegie-Mellon University provide one such methodology. These heuristics were applied to a new and larger set of data in order to evaluate the generality of the scheme and to extend the previous work. The extensions included deriving a semantic understanding of why the rules work, expanded statistical analysis, and a comprehensive sensitivity study to determine the effects of changes in the rules. The results prove that tupling is a useful and general methodology. The sensitivity study enabled the identification of refinements to the rules, while the high degree of skew in the tuple variables enables us to propose that the extreme percentiles be used as an alarm threshold for proactive fault management.
Keywords :
computer debugging; fault tolerant computing; reliability; statistical analysis; system recovery; alarm threshold; event logs; event tupling schemes; expanded statistical analysis; proactive fault management; semantic understanding; system availability; tuple variables; tupling heuristics; Board of Directors; Computer errors; Fault detection; Fault diagnosis; Filtering; Hardware; Robustness; Statistical analysis; Voice mail
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fault Tolerant Computing, 1996., Proceedings of Annual Symposium on
Conference_Location :
Sendai
ISSN :
0731-3071
Print_ISBN :
0-8186-7262-5
Type :
conf
DOI :
10.1109/FTCS.1996.534614
Filename :
534614