Title :
Ensemble of Bayesian Predictors for Autonomic Failure Management in Cloud Computing
Author :
Guan, Qiang ; Zhang, Ziming ; Fu, Song
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of North Texas, Denton, TX, USA
Date :
July 31 2011-Aug. 4 2011
Abstract :
In modern cloud computing systems, hundreds or even thousands of cloud servers are interconnected by multi-layer networks. In such large-scale and complex systems, failures are common. Proactive failure management is a crucial technology for characterizing system behaviors and forecasting failure dynamics in the cloud. To make failure predictions, we need to monitor system execution and collect health-related runtime performance data. However, in newly deployed or managed cloud systems, these data are usually unlabeled, so approaches based on supervised learning are not suitable. In this paper, we present an unsupervised failure detection method using an ensemble of Bayesian models. It estimates the probability distribution of runtime performance data collected by health monitoring tools when cloud servers perform normally. It thereby characterizes the normal execution states of the system and detects anomalous behaviors. Experimental results in an institute-wide cloud computing system show that our method can achieve a high true positive rate and a low false positive rate for proactive failure management.
Keywords :
Bayes methods; cloud computing; fault tolerant computing; statistical distributions; unsupervised learning; Bayesian predictor ensemble; autonomic failure management; cloud computing systems; health monitoring tools; multilayer networks; proactive failure management; runtime performance data probability distribution; unsupervised failure detection method; Bayesian methods; Cloud computing; Computational modeling; Data models; Monitoring; Runtime; Servers;
Conference_Title :
Computer Communications and Networks (ICCCN), 2011 Proceedings of 20th International Conference on
Conference_Location :
Maui, HI
Print_ISBN :
978-1-4577-0637-0
DOI :
10.1109/ICCCN.2011.6006036