DocumentCode
2693060
Title
Online detection of utility cloud anomalies using metric distributions
Author
Wang, Chengwei ; Talwar, Vanish ; Schwan, Karsten ; Ranganathan, Parthasarathy
Author_Institution
Center for Exp. Res. in Comput. Syst., Georgia Inst. of Technol., Atlanta, GA, USA
fYear
2010
fDate
19-23 April 2010
Firstpage
96
Lastpage
103
Abstract
The online detection of anomalies is a vital element of operations in data centers and in utility clouds like Amazon EC2. Given ever-increasing data center sizes coupled with the complexities of systems software, applications, and workload patterns, such anomaly detection must operate automatically, at runtime, and without the need for prior knowledge about normal or anomalous behaviors. Further, detection should function for different levels of abstraction like hardware and software, and for the multiple metrics used in cloud computing systems. This paper proposes EbAT - Entropy-based Anomaly Testing - offering novel methods that detect anomalies by analyzing for arbitrary metrics their distributions rather than individual metric thresholds. Entropy is used as a measurement that captures the degree of dispersal or concentration of such distributions, aggregating raw metric data across the cloud stack to form entropy time series. For scalability, such time series can then be combined hierarchically and across multiple cloud subsystems. Experimental results on utility cloud scenarios demonstrate the viability of the approach. EbAT outperforms threshold-based methods with on average 57.4% improvement in accuracy of anomaly detection and also does better by 59.3% on average in false alarm rate with a 'near-optimum' threshold-based method.
Keywords
Internet; computer centres; program testing; software metrics; Amazon EC2; EbAT; anomaly detection; data centers; entropy-based anomaly testing; metric distributions; online detection; utility cloud anomalies; Application software; Cloud computing; Dispersion; Entropy; Hardware; Runtime; Scalability; System software; Testing; Time measurement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Network Operations and Management Symposium (NOMS), 2010 IEEE
Conference_Location
Osaka
ISSN
1542-1201
Print_ISBN
978-1-4244-5366-5
Electronic_ISBN
1542-1201
Type
conf
DOI
10.1109/NOMS.2010.5488443
Filename
5488443