• DocumentCode
    3703534
  • Title

    Modeling recurrent distributions in streams using possible worlds

  • Author

    Michael Geilke;Andreas Karwath;Stefan Kramer

  • Author_Institution
    Johannes Gutenberg-Universität Mainz, Staudingerweg 9, 55128 Mainz, Germany
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    Discovering changes in the data distribution of streams and discovering recurrent data distributions are challenging problems in data mining and machine learning. Both have received a lot of attention in the context of classification. With the ever-increasing growth of data, however, there is a high demand for compact and universal representations of data streams that enable the user to analyze current as well as historic data without having access to the raw data. As a first step in this direction, we propose a condensed representation that captures the various - possibly recurrent - data distributions of the stream by extending the notion of possible worlds. The representation enables queries concerning the whole stream and can hence serve as a tool for supporting decision-making processes or as a basis for implementing data mining and machine learning algorithms on top of it. We evaluate this condensed representation on synthetic and real-world data.
  • Keywords
    "Data mining","Context","Machine learning algorithms","Shape","Clocks","Itemsets","Data models"
  • Publisher
    IEEE
  • Conference_Titel
    2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
  • Print_ISBN
    978-1-4673-8272-4
  • Type

    conf

  • DOI
    10.1109/DSAA.2015.7344814
  • Filename
    7344814