• DocumentCode
    888473
  • Title
    A framework for on-demand classification of evolving data streams
  • Author
    Aggarwal, Charu C.; Han, Jiawei; Wang, Jianyong; Yu, Philip S.
  • Author_Institution
    IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
  • Volume
    18
  • Issue
    5
  • fYear
    2006
  • fDate
    5/1/2006 12:00:00 AM
  • Firstpage
    577
  • Lastpage
    589
  • Abstract
    Current models of the classification problem do not effectively handle bursts of particular classes coming in at different times. In fact, the current model of the classification problem simply concentrates on methods for one-pass classification modeling of very large data sets. Our model for data stream classification views the data stream classification problem from the point of view of a dynamic approach in which simultaneous training and test streams are used for dynamic classification of data sets. This model reflects real-life situations effectively, since it is desirable to classify test streams in real time over an evolving training and test stream. The aim here is to create a classification system in which the training model can adapt quickly to the changes of the underlying data stream. In order to achieve this goal, we propose an on-demand classification process which can dynamically select the appropriate window of past training data to build the classifier. The empirical results indicate that the system maintains a high classification accuracy in an evolving data stream, while providing an efficient solution to the classification task.
  • Keywords
    data mining; pattern classification; unsupervised learning; very large databases; data stream classification; on-demand classification; training model; very large data set; History; Memory; Nearest neighbor searches; Performance evaluation; Testing; Training data; Stream classification; geometric time frame; microclustering; nearest neighbor
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2006.69
  • Filename
    1613862