DocumentCode :
717078
Title :
Fault detection for cloud computing systems with correlation analysis
Author :
Tao Wang ; Wenbo Zhang ; Jun Wei ; Hua Zhong
Author_Institution :
Inst. of Software, Beijing, China
fYear :
2015
fDate :
11-15 May 2015
Firstpage :
652
Lastpage :
658
Abstract :
The large-scale dynamic cloud computing environment raises great challenges for fault diagnosis in Web applications. First, fluctuating workloads cause traditional application models to change over time. Second, modeling the behavior of complex applications requires domain knowledge that is difficult to obtain. Finally, managing large-scale applications manually is impractical for operators. This paper addresses these issues and proposes an automatic fault diagnosis method for Web applications in cloud computing. We propose an online incremental clustering method to recognize access behavior patterns, and use CCA to model the correlation between workloads and the application performance and resource utilization metrics within a specific access behavior pattern. Our method detects anomalies by discovering abrupt changes in the correlation coefficients with an EWMA control chart, and then locates suspicious metrics using a feature selection method that combines ReliefF and SVM-RFE. We validate our method by injecting typical faults into TPC-W, an industry-standard benchmark, and the experimental results demonstrate that it can effectively detect these faults.
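The detection step in the abstract can be illustrated with a minimal sketch of a standard exponentially weighted moving average (EWMA) control chart applied to a series of correlation coefficients. This is not the authors' implementation: the function name `ewma_alarms`, the smoothing weight `lam`, the limit width `width`, and the assumption that the in-control mean and standard deviation come from a fault-free baseline window are all illustrative choices that this record does not specify.

```python
import math


def ewma_alarms(values, mean, sigma, lam=0.2, width=3.0):
    """Return indices where the EWMA statistic leaves its control limits.

    values: observed correlation coefficients, one per time window
    mean, sigma: in-control mean and standard deviation (assumed to be
                 estimated from a fault-free baseline; hypothetical inputs)
    lam: smoothing weight in (0, 1]; width: limit width in sigma units
    """
    alarms = []
    z = mean  # the EWMA statistic starts at the in-control mean
    for t, x in enumerate(values, start=1):
        z = lam * x + (1.0 - lam) * z
        # Time-varying half-width of the control limits for an EWMA chart
        variance_factor = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        half_width = width * sigma * math.sqrt(variance_factor)
        if abs(z - mean) > half_width:
            alarms.append(t - 1)  # report the 0-based index of the window
    return alarms


# Illustrative data: a stable correlation that drops abruptly at index 10.
series = [0.9] * 10 + [0.2] * 5
print(ewma_alarms(series, mean=0.9, sigma=0.02))
```

With a small `lam` the chart reacts more slowly but is less sensitive to isolated noisy windows; a larger `lam` makes it behave more like a plain threshold on each observation.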
Keywords :
cloud computing; fault diagnosis; feature selection; pattern clustering; resource allocation; security of data; CCA; EWMA control chart; ReliefF; SVM-RFE; TPC-W; Web applications; access behavior patterns; anomaly detection; automatic fault diagnosis method; correlation analysis; fault detection; feature selection method; fluctuating workloads; large-scale dynamic cloud computing environment; online incremental clustering method; resource utilization; Correlation; Correlation coefficient; Fault diagnosis; Measurement; Monitoring; Resource management; Servers; Cloud Computing; Fault Detection; Performance Anomaly; Software Monitoring;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Integrated Network Management (IM), 2015 IFIP/IEEE International Symposium on
Conference_Location :
Ottawa, ON
Type :
conf
DOI :
10.1109/INM.2015.7140351
Filename :
7140351