• DocumentCode
    2439968
  • Title
    An Entropy-Based Data Summarization Algorithm in Data Stream System
  • Author
    Lin, Ouyang ; Qing-ping, Guo
  • Author_Institution
    Sch. of Inf. Eng., Wuhan Univ. of Technol., Wuhan
  • Volume
    2
  • fYear
    2008
  • fDate
    19-20 Dec. 2008
  • Firstpage
    872
  • Lastpage
    876
  • Abstract
    Recently, there has been much interest in building stream processing applications. In these applications, also called data stream applications, data are usually unbounded, continuous, high in volume, fast-arriving, time-varying, and bursty. To process the input data stream under real-time constraints, overload data must be dropped, and deciding which data to drop is a key problem. By predicting the data that will stream into the system, a data summarization algorithm can provide heuristic information that helps the data stream processing system drop overload input data. In this paper, an entropy-based data summarization algorithm (EBDS) is presented. EBDS is designed to produce samples that are "close" to the whole data. It calculates the entropy of the data in a jumping window, and the experiments indicate that it achieves high predictive accuracy.
  • Keywords
    data handling; entropy; data stream system; entropy-based data summarization; input data stream; real time constraint; stream processing application; Accuracy; Aggregates; Conferences; Data processing; Degradation; Entropy; Frequency; Sampling methods; Time factors; Wavelet transforms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Industrial Application, 2008. PACIIA '08. Pacific-Asia Workshop on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-0-7695-3490-9
  • Type
    conf
  • DOI
    10.1109/PACIIA.2008.132
  • Filename
    4756900