DocumentCode :
3123037
Title :
Online Anomaly Prediction for Robust Cluster Systems
Author :
Gu, Xiaohui ; Wang, Haixun
Author_Institution :
North Carolina State Univ., Raleigh, NC
fYear :
2009
fDate :
March 29 2009-April 2 2009
Firstpage :
1000
Lastpage :
1011
Abstract :
In this paper, we present a stream-based mining algorithm for online anomaly prediction. Many real-world applications, such as data stream analysis, require continuous cluster operation. Unfortunately, today's large-scale cluster systems are still vulnerable to various software and hardware problems. System administrators are often overwhelmed by the tasks of correcting various system anomalies such as processing bottlenecks (i.e., full stream buffers), resource hot spots, and service level objective (SLO) violations. Our anomaly prediction scheme raises early alerts for impending system anomalies and suggests possible anomaly causes. Specifically, we employ Bayesian classification methods to capture different anomaly symptoms and infer anomaly causes. Markov models are introduced to capture the changing patterns of different measurement metrics. More importantly, our scheme combines Markov models and Bayesian classification methods to predict when a system anomaly will appear in the foreseeable future and what the possible anomaly causes are. To the best of our knowledge, our work provides the first stream-based mining algorithm for predicting system anomalies. We have implemented our approach within the IBM System S distributed stream processing cluster, and conducted case study experiments using fully implemented distributed data analysis applications processing real application workloads. Our experiments show that our approach efficiently predicts and diagnoses several bottleneck anomalies with high accuracy while imposing low overhead on the cluster system.
Keywords :
Bayes methods; Markov processes; data analysis; distributed processing; pattern classification; security of data; Bayesian classification methods; Markov models; data stream analysis; online anomaly prediction; resource hot spots; robust cluster systems; service level objective violations; stream-based mining algorithm; the IBM System S distributed stream processing cluster; Application software; Bayesian methods; Clustering algorithms; Continuous time systems; Data analysis; Data engineering; Hardware; Large-scale systems; Predictive models; Robustness;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
ISSN :
1084-4627
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
Type :
conf
DOI :
10.1109/ICDE.2009.128
Filename :
4812472