Title :
A fast noise resilient anomaly detection using GMM-based collective labelling
Author :
Bigdeli, Elnaz ; Raahemi, Bijan ; Mohammadi, Mahdi ; Matwin, Stan
Author_Institution :
Sch. of Electr. Eng. & Comput. Sci., Univ. of Ottawa, Ottawa, ON, Canada
Abstract :
Anomaly detection algorithms face several challenges, including computational complexity and resilience to noise in the input data. In this paper, we propose a fast, noise-resilient, cluster-based anomaly detection method that uses a collective labelling approach. In the proposed Collective Probabilistic Anomaly Detection method, first, instead of labelling each new sample (as normal or anomalous) individually, the new samples are clustered and then labelled. This collective labelling mitigates the negative impact of noise by relying on group behaviour rather than on the individual characteristics of incoming samples. Second, since grouping and labelling new samples may be time-consuming, we summarize clusters using a Gaussian Mixture Model (GMM). A GMM not only offers faster processing; it also makes it possible to summarize clusters of arbitrary shape and consequently reduces the memory requirement. Finally, a modified distance measure, based on the Kullback-Leibler divergence, is proposed to calculate the similarity among clusters represented by GMMs. We evaluate the proposed method on various datasets by measuring its false alarm rate, detection rate and memory requirement. We also add different levels of noise to the input datasets to demonstrate the performance of the proposed collective anomaly detection method in the presence of noise. The experimental results confirm the superior performance of the proposed method compared to individual labelling techniques in terms of memory usage, detection rate and false alarm rate.
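The collective labelling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: each group of samples is summarized by a single Gaussian (mean and covariance) rather than a full GMM, the distance is a plain symmetrised Kullback-Leibler divergence rather than the modified measure the paper proposes, and the function names and the threshold are hypothetical.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff
                  - k + logdet1 - logdet0)

def symmetric_kl(mu0, cov0, mu1, cov1):
    """Symmetrised KL: sum of both directions (an assumed, simple choice)."""
    return gaussian_kl(mu0, cov0, mu1, cov1) + gaussian_kl(mu1, cov1, mu0, cov0)

def summarize(samples):
    """Summarize a group of samples by its mean and covariance.

    Assumption: one Gaussian per group; a GMM would keep several components.
    """
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def label_group(new_samples, normal_summaries, threshold):
    """Label a whole group of new samples at once (collective labelling).

    The group is 'normal' if its summary is within `threshold` (hypothetical
    value) of at least one stored summary of normal data, else 'anomaly'.
    """
    mu, cov = summarize(new_samples)
    dist = min(symmetric_kl(mu, cov, m, c) for m, c in normal_summaries)
    return ("normal" if dist <= threshold else "anomaly"), dist

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Only the summaries of the normal data are stored, not the raw samples.
    normal_summaries = [summarize(rng.normal(0.0, 1.0, size=(500, 2)))]
    near_group = rng.normal(0.0, 1.0, size=(200, 2))
    far_group = rng.normal(8.0, 1.0, size=(200, 2))
    print(label_group(near_group, normal_summaries, threshold=5.0)[0])
    print(label_group(far_group, normal_summaries, threshold=5.0)[0])
```

Because the label is decided from the group summary, a few noisy points inside a group shift the mean and covariance only slightly, which is the intuition behind the noise resilience claimed above. Storing only means and covariances, instead of raw samples, likewise reflects the memory saving the abstract attributes to the GMM summary.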
Keywords :
Gaussian processes; computational complexity; distance measurement; mixture models; pattern clustering; probability; GMM-based collective labelling; Gaussian mixture model; Kullback-Leibler method; arbitrary shape clustering; collective probabilistic anomaly detection method; computational complexity; detection rate; distance measurement; false alarm rate; memory requirement; memory usage; noise-resilient cluster-based anomaly detection method; Clustering algorithms; Gaussian mixture model; Noise; Shape; Support vector machines; Training; Anomaly Detection; Arbitrary Shape Clustering; Collective Labeling; Distribution Distance; Gaussian Mixture Model; Kullback-Leibler distance;
Conference_Title :
Science and Information Conference (SAI), 2015
Conference_Location :
London
DOI :
10.1109/SAI.2015.7237166