DocumentCode :
1284366
Title :
Cost-Aware Rank Join with Random and Sorted Access
Author :
Martinenghi, D. ; Tagliasacchi, M.
Author_Institution :
Dipt. di Elettron. e Inf., Politec. di Milano, Milan, Italy
Volume :
24
Issue :
12
fYear :
2012
Firstpage :
2143
Lastpage :
2155
Abstract :
In this paper, we address the problem of joining ranked results produced by two or more services on the web. We consider services endowed with two kinds of access that are often available: 1) sorted access, which returns tuples sorted by score; 2) random access, which returns tuples matching a given join attribute value. Rank join operators combine objects of two or more relations and output the k combinations with the highest aggregate score. While the past literature has studied suitable bounding schemes for this setting, in this paper we focus on the definition of a pulling strategy, which determines the order of invocation of the joined services. We propose the Cost-Aware with Random and Sorted access (CARS) pulling strategy, which is derived at compile-time and is oblivious of the query-dependent score distributions. We cast CARS as the solution of an optimization problem based on a small set of parameters characterizing the joined services. We validate the proposed strategy with experiments on both real and synthetic data sets. We show that CARS outperforms prior proposals and that its overall access cost is always within a very short margin from that of an oracle-based optimal strategy. In addition, CARS is shown to be robust w.r.t. the uncertainty that may characterize the estimated parameters.
Keywords :
Web services; query processing; search engines; CARS pulling strategy; Web services; bounding schemes; cost-aware rank join; cost-aware with random and sorted access pulling strategy; join attribute value; optimization problem; oracle-based optimal strategy; parameter estimation; pulling strategy; query-dependent score distributions; rank join operators; real data sets; synthetic data sets; Context awareness; Information retrieval; Optimization; Relational databases; Search engines; Upper bound; Top-k; random access; rank join; sorted access;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
ieee
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2011.161
Filename :
5963672
Link To Document :
بازگشت