DocumentCode :
343445
Title :
Fast approximate answers to aggregate queries on a data cube
Author :
Poosala, Viswanath ; Ganti, Venkatesh
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
fYear :
1999
fDate :
1 Aug 1999
Firstpage :
24
Lastpage :
33
Abstract :
Modern decision support systems require very quick (interactive) responses from the DBMS, but pose complex queries on large volumes of data. In this paper, we present a novel solution to this problem: we precompute concise histogram statistics on the data to answer the queries quickly but approximately. Our hypothesis is that many decision support applications can tolerate small errors in query results in return for large reductions in response times. In particular, we propose the use of multiple histograms to approximate the data cube and answer aggregate queries approximately using this summarized data. We enhance histograms to estimate the quality of the approximate answers. We primarily explore the interaction among various histograms on the data cube in order to minimize the space needed when an upper bound on the errors is given. Our main contribution in this paper is an efficient technique for selecting a provably near-optimal set of histograms on the data cube. Extensive experiments show that our technique results in very accurate and concise statistics. Our technique is general in nature and can also be used for selecting a set of histograms (or other statistics) on a relation for the purpose of selectivity estimation.
Keywords :
data analysis; data mining; data structures; decision support systems; query processing; statistical databases; DBMS response time; aggregate queries; approximate answer quality estimation; complex queries; concise histogram statistics; data cube; decision support systems; fast approximate answers; multiple histograms; provably near-optimal set; query result error tolerance; selectivity estimation; summarized data; upper error bound; Aggregates; Data analysis; Databases; Delay; Histograms; Information analysis; Reactive power; Read only memory; Statistics; Upper bound;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Scientific and Statistical Database Management, 1999. Eleventh International Conference on
Conference_Location :
Cleveland, OH
Print_ISBN :
0-7695-0046-3
Type :
conf
DOI :
10.1109/SSDM.1999.787618
Filename :
787618