• DocumentCode
    1120213
  • Title
    Clustering over Multiple Evolving Streams by Events and Correlations
  • Author
    Yeh, Mi-Yen ; Dai, Bi-Ru ; Chen, Ming-Syan

  • Author_Institution
    Nat. Taiwan Univ., Taipei
  • Volume
    19
  • Issue
    10
  • fYear
    2007
  • Firstpage
    1349
  • Lastpage
    1362
  • Abstract
    In applications involving multiple data streams, such as stock market trading and sensor network data analysis, the clusters of streams change at different times because of data evolution. Information about evolving clusters is valuable for supporting the corresponding online decisions. In this paper, we present a framework for clustering over multiple evolving streams by correlations and events, abbreviated as COMET-CORE, which monitors the distribution of clusters over multiple data streams based on their correlation. Instead of directly clustering the multiple data streams periodically, COMET-CORE applies efficient cluster split and merge processes only when significant cluster evolution happens. Accordingly, we devise an event detection mechanism to signal the cluster adjustments. Incoming streams are smoothed into sequences of end points by piecewise linear approximation. Whenever end points are generated, the weighted correlations between streams are updated. End points are good indicators of significant change in streams, which is a main cause of cluster evolution events. When an event occurs, split and merge operations let us report the latest clustering results. As shown in our experimental studies, COMET-CORE performs effectively with good clustering quality.
  • Keywords
    approximation theory; data mining; pattern clustering; COMET-CORE; data evolution; event detection; multiple data streams; piecewise linear approximation; streams clustering; Computerized monitoring; Data analysis; Decision making; Gene expression; Investments; Signal detection; Stock markets; data clustering; data streams
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2007.1071
  • Filename
    4302743