• DocumentCode
    2030521
  • Title

    A fast noise resilient anomaly detection using GMM-based collective labelling

  • Author

    Bigdeli, Elnaz ; Raahemi, Bijan ; Mohammadi, Mahdi ; Matwin, Stan

  • Author_Institution
    Sch. of Electr. Eng. & Comput. Sci., Univ. of Ottawa, Ottawa, ON, Canada
  • fYear
    2015
  • fDate
    28-30 July 2015
  • Firstpage
    337
  • Lastpage
    344
  • Abstract
    Anomaly detection algorithms face several challenges, including computational complexity and resilience to noise in the input data. In this paper, we propose a fast and noise-resilient cluster-based anomaly detection method that uses a collective labelling approach. In the proposed Collective Probabilistic Anomaly Detection method, first, instead of labelling each new sample (as normal or anomalous) individually, the new samples are clustered and then labelled. This collective labelling mitigates the negative impact of noise by relying on group behaviour rather than on the individual characteristics of incoming samples. Second, since grouping and labelling new samples may be time-consuming, we summarize clusters using a Gaussian Mixture Model (GMM). The GMM not only offers faster processing; it also facilitates summarizing clusters of arbitrary shape and consequently reduces the memory requirement. Finally, a modified distance measure, based on the Kullback-Leibler method, is proposed to calculate the similarity among clusters represented by GMMs. We evaluate the proposed method on various datasets by measuring its false alarm rate, detection rate and memory requirement. We also add different levels of noise to the input datasets to demonstrate the performance of the proposed collective anomaly detection method in the presence of noise. The experimental results confirm the superior performance of the proposed method compared to individual labelling techniques in terms of memory usage, detection rate and false alarm rate.
  • Keywords
    Gaussian processes; computational complexity; distance measurement; mixture models; pattern clustering; probability; GMM-based collective labelling; Gaussian mixture model; Kullback-Leibler method; arbitrary shape clustering; collective probabilistic anomaly detection method; computational complexity; detection rate; distance measurement; false alarm rate; memory requirement; memory usage; noise-resilient cluster-based anomaly detection method; Clustering algorithms; Gaussian mixture model; Noise; Shape; Support vector machines; Training; Anomaly Detection; Arbitrary Shape Clustering; Collective Labeling; Distribution Distance; Gaussian Mixture Model; Kullback-Leibler distance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Science and Information Conference (SAI), 2015
  • Conference_Location
    London
  • Type

    conf

  • DOI
    10.1109/SAI.2015.7237166
  • Filename
    7237166
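
The collective labelling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not specify the modified distance, so a well-known matching-based approximation of the Kullback-Leibler divergence between mixtures is swapped in, restricted to 1-D components for brevity. The clustering and GMM-fitting steps are omitted (each batch is assumed to be already summarized), and all function names, the example mixtures, and the threshold value are hypothetical.

```python
import math

def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL(N(m1, s1^2) || N(m2, s2^2)) for 1-D Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def kl_gmm_matched(f, g):
    """Matching-based approximation of KL(f || g) for two 1-D GMMs.

    Each GMM is a list of (weight, mean, std) tuples. Every component of f
    is matched to the component of g that minimizes the per-component cost.
    This is an assumed stand-in, not the paper's modified measure.
    """
    total = 0.0
    for wf, mf, sf in f:
        total += wf * min(
            kl_gauss(mf, sf, mg, sg) + math.log(wf / wg) for wg, mg, sg in g
        )
    return total

def symmetric_distance(f, g):
    """Symmetrized version of the approximation above."""
    return 0.5 * (kl_gmm_matched(f, g) + kl_gmm_matched(g, f))

def label_batch(batch_gmm, normal_gmms, threshold):
    """Label a whole batch at once from its GMM summary.

    The batch is 'normal' if its summary lies within `threshold` of any
    stored normal-cluster summary, otherwise 'anomaly'.
    """
    d = min(symmetric_distance(batch_gmm, n) for n in normal_gmms)
    return ("normal" if d <= threshold else "anomaly"), d

# Hypothetical summaries: one known normal cluster with two components.
normal_gmms = [[(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]]
close_batch = [(0.5, 0.1, 1.0), (0.5, 4.1, 1.0)]  # slightly shifted copy
far_batch = [(1.0, 20.0, 1.0)]                    # far from normal data

print(label_batch(close_batch, normal_gmms, threshold=1.0))
print(label_batch(far_batch, normal_gmms, threshold=1.0))
```

Because only the (weight, mean, std) triples are stored and compared, the raw samples of each cluster need not be kept, which is the memory argument the abstract makes; a single noisy sample also shifts a batch summary only slightly, so it is less likely to flip the label.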