• DocumentCode
    3089943
  • Title

    Anomaly Detection Algorithms on IBM InfoSphere Streams: Anomaly Detection for Data in Motion

  • Author

    Yulevich, Yifat ; Pyasik, Alex ; Gorelik, Leonid

  • Author_Institution
    Software Lab., Software Solutions Dept., IBM Israel Software Lab., Rehovot, Israel
  • fYear
    2012
  • fDate
    10-13 July 2012
  • Firstpage
    301
  • Lastpage
    308
  • Abstract
    This paper presents excerpts from our implementation of near real-time anomaly detection algorithms on the IBM InfoSphere Streams platform. The purpose of this article is to: 1) Describe how to design and implement known anomaly detection algorithms on IBM InfoSphere Streams. 2) Present some performance optimization capabilities of the IBM InfoSphere Streams platform and propose a method for using them in anomaly detection applications. 3) Present some IBM InfoSphere Streams best practices and describe their adoption in the context of anomaly detection applications. The document describes the architecture and design of anomaly detection algorithms developed on IBM InfoSphere Streams. Although the solution was designed for cyber security, the implemented algorithms are agnostic to the type of data they monitor and can therefore detect anomalies in data from various industries such as healthcare, finance and retail. The document describes the implementation of two anomaly detection algorithms: KOAD and PCA. The KOAD algorithm performs online anomaly detection with incremental learning, and the PCA algorithm performs offline anomaly detection. The solution was designed to provide near real-time, low-latency insight into large volumes of observed data.
  • Keywords
    learning (artificial intelligence); principal component analysis; security of data; IBM InfoSphere Streams platform; KOAD; PCA; anomaly detection algorithm; cyber security; data volume observation; finance; healthcare; incremental learning; offline anomaly detection; online anomaly detection; performance optimization capability; retail; Algorithm design and analysis; Detection algorithms; Dictionaries; Kernel; Measurement; Principal component analysis; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing with Applications (ISPA), 2012 IEEE 10th International Symposium on
  • Conference_Location
    Leganes
  • Print_ISBN
    978-1-4673-1631-6
  • Type

    conf

  • DOI
    10.1109/ISPA.2012.145
  • Filename
    6280306