• DocumentCode
    2731346
  • Title

    Space Efficient Streaming Algorithms for the Maximum Error Histogram

  • Author

    Buragohain, C. ; Shrivastava, Nitisha ; Suri, Sean

  • Author_Institution
    Amazon.com, Seattle, WA, USA
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Firstpage
    1026
  • Lastpage
    1035
  • Abstract
    We propose new algorithms for constructing maximum error (L) histograms in the data stream model. Our first algorithm (Min-Merge) achieves the following performance guarantee: using O(B) memory, it constructs a 2B-bucket histogram whose approximation error is at most the error of the optimal B-bucket histogram. Our second algorithm (Min-Increment) achieves a (1 + ε)-approximation of a B-bucket histogram using O(ε-1 B log U) space, where U is the size of the domain for data values. The memory requirements of these algorithms are a significant improvement over the previous best schemes for constructing near-optimal histograms in the data stream model, making them ideal for data summary applications where memory is at a premium, such as wireless sensor networks. Our Min-Increment algorithm also extends to the sliding window model without any asymptotic increase in space. Finally, using synthetic and real-world data, we show that our algorithms are indeed as space-efficient in practice as their theoretical analysis predicts - compared to previous best algorithms, they require two or more orders of magnitude less memory for the same approximation error.
  • Keywords
    approximation theory; computational complexity; query processing; 2B-bucket histogram; Min-Increment algorithm; Min-Merge achieves; approximation error; data stream model; maximum error histograms; optimal B-bucket histogram; space efficient streaming algorithms; Algorithm design and analysis; Approximation algorithms; Approximation error; Computer errors; Computer networks; Histograms; Monitoring; Piecewise linear approximation; Piecewise linear techniques; Wireless sensor networks;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
  • Conference_Location
    Istanbul
  • Print_ISBN
    1-4244-0802-4
  • Type

    conf

  • DOI
    10.1109/ICDE.2007.368961
  • Filename
    4221751