• DocumentCode
    3305677
  • Title

    Novel algorithms for computing medians and other quantiles of disk-resident data

  • Author

    Fu, Lixin ; Rajasekaran, Sanguthevar

  • Author_Institution
    Dept. of Comput. & Inf. Sci., Florida Univ., Gainesville, FL, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    145
  • Lastpage
    154
  • Abstract
    In data warehousing applications, numerous OLAP queries involve the processing of holistic operations such as computing the “top N”, median, etc. Efficient implementations of these operations are hard to come by. Several algorithms have been proposed in the literature that estimate various quantiles of disk-resident data. Two such recent algorithms are based on sampling. The authors present two novel and efficient quantiling algorithms, Deterministic Bucketing (DB) and Randomized Bucketing (RB). We have analyzed the performance of DB and RE and extended the analysis of the sampling done in prior algorithms. We have conducted extensive experiments to compare all these four algorithms. Our experimental data indicate that our new algorithms outperform prior algorithms not only in the overall run time but also in accuracy. The new algorithms can be used either as one-pass algorithms to accurately estimate quantiles or as algorithms for computing the quantiles exactly
  • Keywords
    data mining; data warehouses; sampling methods; Deterministic Bucketing; OLAP queries; Randomized Bucketing; data warehousing applications; disk-resident data; holistic operations; medians; one-pass algorithms; quantiles; quantiling algorithms; sampling; Algorithm design and analysis; Partitioning algorithms; Performance analysis; Phase change random access memory; Sampling methods; Sorting; Upper bound; Warehousing;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Database Engineering and Applications, 2001 International Symposium on.
  • Conference_Location
    Grenoble
  • Print_ISBN
    0-7695-1140-6
  • Type

    conf

  • DOI
    10.1109/IDEAS.2001.938081
  • Filename
    938081