• DocumentCode
    1890016
  • Title

    Coping with training contamination in unsupervised distributional anomaly detection

  • Author

    Borges, Nash ; Meyer, Gerard G L

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD
  • fYear
    2009
  • fDate
    18-20 March 2009
  • Firstpage
    264
  • Lastpage
    269
  • Abstract
    In previous work, we presented several distributional approaches to anomaly detection for a speech activity detector by training a model on purely nominal data and estimating the divergence between it and other input. Here, we reformulate the problem in an unsupervised framework and allow for anomalous contamination of the training data. After noting the instability of Gaussian mixture models (GMMs) in this context, we focus on non-parametric methods using regularly binned histograms. While the performance of the log-likelihood baseline suffered as the amount of contamination increased, many of the distributional approaches were not affected. We found that the L1 distance, χ² statistic, and information-theoretic divergences consistently outperformed the other methods for a variety of contamination levels and test segment lengths.
  • Keywords
    Gaussian processes; learning (artificial intelligence); signal classification; speech recognition; statistical analysis; Gaussian mixture model; anomalous training data contamination; binned histogram; log likelihood baseline; speech activity detector; training classifier; unsupervised distributional anomaly detection; Contamination; Context modeling; Detectors; Histograms; Information theory; Speech; Statistical analysis; Statistical distributions; Testing; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Sciences and Systems, 2009. CISS 2009. 43rd Annual Conference on
  • Conference_Location
    Baltimore, MD
  • Print_ISBN
    978-1-4244-2733-8
  • Electronic_ISBN
    978-1-4244-2734-5
  • Type

    conf

  • DOI
    10.1109/CISS.2009.5054728
  • Filename
    5054728
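
The approach summarized in the abstract (fit a regularly binned histogram to possibly contaminated training data, then score a test segment by its distance from that model) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the synthetic 1-D Gaussian data standing in for speech features, the bin range and count, the additive smoothing constant `eps`, the symmetric form of the χ² statistic, the use of KL divergence as one example of an information-theoretic divergence, and all function names.

```python
import math
import random


def histogram(samples, lo, hi, n_bins, eps=1e-6):
    """Regularly binned histogram, normalized to a probability vector.

    eps is additive smoothing (an assumption of this sketch) so that
    no bin is exactly zero when ratios and logarithms are taken.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        # Out-of-range values are clamped into the first or last bin.
        idx = min(max(math.floor((x - lo) / width), 0), n_bins - 1)
        counts[idx] += 1
    total = len(samples) + eps * n_bins
    return [(c + eps) / total for c in counts]


def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def chi2_statistic(p, q):
    # Symmetric chi-squared form; other variants exist.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q))


def kl_divergence(p, q):
    # One example of an information-theoretic divergence.
    return sum(a * math.log(a / b) for a, b in zip(p, q))


# Hypothetical data: nominal ~ N(0, 1), anomalous ~ N(4, 1).
rng = random.Random(0)
LO, HI, BINS = -6.0, 10.0, 32

# Training set with 10% anomalous contamination.
train = [rng.gauss(0, 1) for _ in range(900)]
train += [rng.gauss(4, 1) for _ in range(100)]
model = histogram(train, LO, HI, BINS)

# Two test segments of 200 samples each.
nominal_seg = histogram([rng.gauss(0, 1) for _ in range(200)], LO, HI, BINS)
anomalous_seg = histogram([rng.gauss(4, 1) for _ in range(200)], LO, HI, BINS)

for name, score in [("L1", l1_distance),
                    ("chi2", chi2_statistic),
                    ("KL", kl_divergence)]:
    print(name, score(nominal_seg, model), score(anomalous_seg, model))
```

In this toy setup a larger score marks a segment as more anomalous relative to the trained model. The sketch does not reproduce the paper's features, parameters, or results.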