• DocumentCode
    2772000
  • Title
    Filtering and Refinement: A Two-Stage Approach for Efficient and Effective Anomaly Detection
  • Author
    Yu, Xiao ; Tang, Lu An ; Han, Jiawei
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2009
  • fDate
    6-9 Dec. 2009
  • Firstpage
    617
  • Lastpage
    626
  • Abstract
    Anomaly detection is an important data mining task. Most existing methods treat anomalies as inconsistencies and spend most of their time modeling normal instances. A recently proposed sampling-based approach may substantially boost the efficiency of anomaly detection, but it may also reduce accuracy and robustness. In this study, we propose a two-stage approach that finds anomalies in complex datasets with high accuracy as well as low time complexity and space cost. Instead of analyzing normal instances, our algorithm first employs an efficient deterministic space partition algorithm to eliminate obvious normal instances and generate a small set of anomaly candidates with a single scan of the dataset. It then checks each candidate against multiple density-based criteria to determine the final results. This two-stage framework can also detect anomalies defined under different notions. Our experiments show that the new approach finds anomalies successfully under different conditions and achieves a good balance of efficiency, accuracy, and robustness.
  • Keywords
    data mining; security of data; anomaly detection; complex dataset; data mining; density-based multiple criteria; deterministic space partition algorithm; filtering; normal instances; refinement; Algorithm design and analysis; Computer science; Costs; Data mining; Density measurement; Filtering; Intrusion detection; Partitioning algorithms; Robustness; USA Councils;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-5242-2
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2009.44
  • Filename
    5360288
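
The two-stage idea summarized in the abstract (filter out obvious normal instances with a space partition, then refine the remaining candidates with a density check) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the uniform grid used as the partition, the single k-nearest-neighbor distance test standing in for the "density-based multiple criteria", and the hypothetical names and parameters `cell_size`, `min_cell_count`, `k`, and `max_knn_distance`.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(math.floor(x / cell_size) for x in point)


def filter_candidates(points, cell_size, min_cell_count):
    """Stage 1 (assumed grid partition): bucket points in one scan,
    then keep only the points that fall into sparsely populated cells."""
    cells = defaultdict(list)
    for p in points:  # single pass over the dataset
        cells[cell_of(p, cell_size)].append(p)
    return [p for members in cells.values()
            if len(members) < min_cell_count
            for p in members]


def knn_distance(point, points, k):
    """Distance from `point` to its k-th nearest other point.
    Assumes `point` is itself in `points`, so index 0 is the zero self-distance."""
    dists = sorted(math.dist(point, q) for q in points)
    return dists[k]


def refine(candidates, points, k, max_knn_distance):
    """Stage 2 (assumed single criterion): keep candidates whose
    k-th nearest neighbor is far away, i.e. whose local density is low."""
    return [p for p in candidates
            if knn_distance(p, points, k) > max_knn_distance]


def detect_anomalies(points, cell_size=1.0, min_cell_count=3,
                     k=2, max_knn_distance=1.5):
    candidates = filter_candidates(points, cell_size, min_cell_count)
    return refine(candidates, points, k, max_knn_distance)


# Toy data: a dense cluster, one point just outside it, one isolated point.
SAMPLE_POINTS = [
    (0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4), (0.25, 0.15),
    (1.1, 0.2),   # lands in a sparse cell but is close to the cluster
    (5.0, 5.0),   # isolated
]

print(detect_anomalies(SAMPLE_POINTS))  # → [(5.0, 5.0)]
```

In this toy run the filter passes two candidates, `(1.1, 0.2)` and `(5.0, 5.0)`, and the refinement step discards the first because its neighbors in the cluster are nearby. That is the division of labor the abstract describes: a cheap pass shrinks the dataset to a small candidate set, and the more expensive density check runs only on that set.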