DocumentCode
2405039
Title
Improving range query estimation on histograms
Author
Buccafurri, Francesco ; Rosaci, Domenico ; Pontieri, Luigi ; Sacca, Domenico
Author_Institution
DIMET Dept, Univ. of Reggio Calabria, Italy
fYear
2002
fDate
2002
Firstpage
628
Lastpage
638
Abstract
Histograms summarize the contents of relations into a number of buckets for the estimation of query result sizes. Several techniques (e.g., MaxDiff and V-Optimal) have been proposed for determining bucket boundaries that provide better estimations. This paper proposes using 32 bits of information per bucket (a 4-level tree index) to store approximated cumulative frequencies at 7 internal intervals of the bucket. Both theoretical analysis and experimental results show that the 4-level tree index provides the best frequency estimation inside a bucket. The index is then added to two well-known techniques for constructing histograms, MaxDiff and V-Optimal, yielding large improvements in frequency estimation over inter-bucket ranges with respect to the original methods.
Keywords
query processing; tree data structures; MaxDiff; V-Optimal; approximated cumulative frequency storage; bucket boundaries; buckets; histograms; internal intervals; query result size estimation; range query estimation; tree index; Character generation; Chromium; Frequency estimation; Histograms; Upper bound
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 2002. Proceedings. 18th International Conference on
Conference_Location
San Jose, CA
ISSN
1063-6382
Print_ISBN
0-7695-1531-2
Type
conf
DOI
10.1109/ICDE.2002.994780
Filename
994780