Title :
Robust Stratified Sampling Plans for Low Selectivity Queries
Author :
Joshi, Shantanu ; Jermaine, Christopher
Author_Institution :
Oracle Corp., Redwood Shores, CA
Abstract :
We consider the problem of estimating the result of an aggregate query with a very low selectivity. Traditional sampling techniques can be ineffective for such a problem since a small random sample is likely to miss most or even all of the records satisfying the restrictive selection predicate. Stratified sampling is useful in this situation, but a key problem in applying stratified sampling effectively is identifying which strata are important and developing a sampling plan that favors those strata in a robust fashion. We develop a solution to this problem that combines any prior knowledge or expectation about the stratification with information obtained from pilot sampling in a principled Bayesian framework.
Keywords :
Bayes methods; query processing; sampling methods; Bayesian framework; low selectivity aggregate query; stratified sampling plan; Aggregates; Bayesian methods; Databases; Marketing and sales; Motion pictures; Query processing; Robustness; Sampling methods; Statistics
Conference_Title :
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-1836-7
Electronic_ISBN :
978-1-4244-1837-4
DOI :
10.1109/ICDE.2008.4497428