DocumentCode :
1350713
Title :
Semantics of Ranking Queries for Probabilistic Data
Author :
Jestes, Jeffrey ; Cormode, Graham ; Li, Feifei ; Yi, Ke
Author_Institution :
Sch. of Comput., Univ. of Utah, Salt Lake City, UT, USA
Volume :
23
Issue :
12
fYear :
2011
Firstpage :
1903
Lastpage :
1917
Abstract :
Recently, there have been several attempts to propose definitions and algorithms for ranking queries on probabilistic data. However, these lack many intuitive properties of a top-k query over deterministic data. We define several fundamental properties, including exact-k, containment, unique rank, value invariance, and stability, which are satisfied by ranking queries on certain data. We argue that these properties should also be carefully studied when defining ranking queries on probabilistic data, and that, for most applications, a definition for ranking uncertain data should fulfill them. We propose an intuitive new ranking definition based on the observation that the ranks of a tuple across all possible worlds represent a well-founded rank distribution. We study ranking definitions based on the expectation, the median, and other statistics of this rank distribution for a tuple, and derive the expected rank, median rank, and quantile rank, respectively. We prove that the expected rank, median rank, and quantile rank satisfy all these properties for a ranking query. We provide efficient solutions to compute such rankings across the major models of uncertain data, such as attribute-level and tuple-level uncertainty. Finally, a comprehensive experimental study confirms the effectiveness of our approach.
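The rank distribution described in the abstract can be illustrated with a minimal sketch. It is not the paper's efficient algorithm: it enumerates every possible world by brute force, which is exponential in the number of tuples. It assumes attribute-level uncertainty with independent discrete score distributions, and it assumes a tuple's rank in a world is the number of tuples with a strictly higher score (zero-based). The function names and the `example` data are hypothetical and chosen only for illustration.

```python
from itertools import product


def rank_distributions(tuples):
    """Map each tuple name to its rank distribution {rank: probability}.

    `tuples` maps a name to a list of (score, probability) alternatives.
    Scores of different tuples are assumed independent.
    """
    names = list(tuples)
    dists = {name: {} for name in names}
    # Each possible world picks one alternative per tuple.
    for world in product(*(tuples[name] for name in names)):
        prob = 1.0
        for _, p in world:
            prob *= p
        scores = [score for score, _ in world]
        for i, name in enumerate(names):
            # Assumed rank: count of tuples scoring strictly higher.
            rank = sum(1 for s in scores if s > scores[i])
            dists[name][rank] = dists[name].get(rank, 0.0) + prob
    return dists


def expected_rank(dist):
    """Expectation of a rank distribution."""
    return sum(rank * p for rank, p in dist.items())


def quantile_rank(dist, phi=0.5):
    """Smallest rank whose cumulative probability reaches phi.

    phi = 0.5 gives the median rank.
    """
    cumulative = 0.0
    for rank in sorted(dist):
        cumulative += dist[rank]
        if cumulative >= phi - 1e-12:  # tolerate floating-point error
            return rank
    return max(dist)


def top_k(tuples, k, stat=expected_rank):
    """Return the k tuple names with the smallest rank statistic."""
    dists = rank_distributions(tuples)
    # Ties are broken by name so that exactly k tuples are returned.
    return sorted(dists, key=lambda name: (stat(dists[name]), name))[:k]


# Hypothetical data: three tuples with uncertain scores.
example = {
    "t1": [(100, 0.5), (70, 0.5)],
    "t2": [(92, 0.6), (80, 0.4)],
    "t3": [(85, 1.0)],
}
```

Calling `top_k(example, 2)` ranks by expected rank, while `top_k(example, 2, stat=quantile_rank)` ranks by median rank. Because every tuple receives a single statistic and ties are broken deterministically, the result always contains exactly k tuples.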
Keywords :
database management systems; query processing; attribute-level uncertain data; containment property; exact-k property; expected rank; median rank; probabilistic data; quantile rank; rank distribution; ranking definition; ranking query semantics; stability property; tuple-level uncertain data; unique rank property; value invariance property; Computational modeling; Data models; Database systems; Information retrieval; Probabilistic logic; Semantics; Uncertainty; Probabilistic data; ranking queries; top-k queries; uncertain database
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2010.192
Filename :
5601720