• DocumentCode
    2945215
  • Title

    Deplump for Streaming Data

  • Author

    Bartlett, Nicholas ; Wood, Frank

  • Author_Institution
    Dept. of Stat., Columbia Univ., New York, NY, USA
  • fYear
    2011
  • fDate
    29-31 March 2011
  • Firstpage
    363
  • Lastpage
    372
  • Abstract
    We present a general-purpose, lossless compressor for streaming data. This compressor is based on the deplump probabilistic compressor for batch data. Approximations to the inference procedure used in the probabilistic model underpinning deplump are introduced that yield the computational asymptotics necessary for stream compression. We demonstrate the performance of this streaming deplump variant relative to the batch compressor on a benchmark corpus and find that it performs equivalently well despite these approximations. We also explore the performance of the streaming variant on corpora that are too large to be compressed by batch deplump and demonstrate excellent compression performance.
  • Keywords
    data compression; probability; batch compressor; batch data; deplump probabilistic compressor; inference procedure; probabilistic model underpinning deplump; stream compression; streaming data lossless compressor; streaming deplump variant relative; Approximation algorithms; Approximation methods; Complexity theory; Computational modeling; Context; Inference algorithms; Vegetation; Bayesian; Non-parametric; sequence memoizer;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference (DCC), 2011
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    978-1-61284-279-0
  • Type

    conf

  • DOI
    10.1109/DCC.2011.43
  • Filename
    5749494