Regional Information Center for Science and Technology - Efficient Top-k Retrieval on Massive Data

DocumentCode :

740518

Title :

Efficient Top-k Retrieval on Massive Data

Author :

Han, Xixian ; Li, Jianzhong ; Gao, Hong

Author_Institution :

School of Computer Science and Technology, Harbin Institute of Technology, China

Volume :

Issue :

fYear :

2015

Firstpage :

2687

Lastpage :

2699

Abstract :

In many applications, the top-k query is an important operation that returns a set of interesting points from a potentially huge data space. This paper shows through analysis that existing algorithms cannot process top-k queries on massive data efficiently, and it proposes a novel table-scan-based algorithm, T2S, to compute top-k results efficiently on such data. T2S first constructs a presorted table whose tuples are arranged in the order of round-robin retrieval on the sorted lists. It maintains only a fixed number of tuples while computing results. The paper presents early-termination checking for T2S, along with an analysis of scan depth. Selective retrieval is devised to skip tuples in the presorted table that are not top-k results. Theoretical analysis proves that selective retrieval significantly reduces the number of retrieved tuples. Construction and incremental-update/batch-processing methods for the structures used are also proposed. Extensive experiments on synthetic and real-life data sets show that T2S has a significant advantage over existing algorithms.
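
The idea of a presorted table scanned with early termination can be illustrated with a minimal Python sketch. It is not the T2S procedure from the paper: the function names (build_presorted_table, top_k), the per-row "ceiling" vector, and the non-negative weighted-sum scoring function are all assumptions made for illustration, and selective retrieval is not modeled. Each row is placed in the table the first time round-robin retrieval over the per-attribute sorted lists reaches it, together with the last value read from every list at that moment. That vector bounds the attributes of the row and of every row after it, so a sequential scan can stop once the k-th best score found so far is at least the score of the ceiling.

```python
import heapq


def build_presorted_table(rows):
    """Arrange row ids in round-robin order over per-attribute sorted lists.

    Returns a list of (ceiling, row_id). `ceiling` holds the last value read
    from each sorted list when the row was first reached; no attribute of this
    row, or of any later row, exceeds the matching ceiling entry.
    (Storing a ceiling per row is an assumption of this sketch.)
    """
    if not rows:
        return []
    n, m = len(rows), len(rows[0])
    # One list of row ids per attribute, sorted by that attribute, descending.
    sorted_lists = [
        sorted(range(n), key=lambda r, a=a: -rows[r][a]) for a in range(m)
    ]
    # Start from each attribute's maximum, a valid bound before any read.
    last_read = [rows[sorted_lists[a][0]][a] for a in range(m)]
    seen, table = set(), []
    for depth in range(n):
        for a in range(m):
            rid = sorted_lists[a][depth]
            last_read[a] = rows[rid][a]
            if rid not in seen:
                seen.add(rid)
                table.append((tuple(last_read), rid))
    return table


def top_k(rows, table, weights, k):
    """Scan the presorted table sequentially, keeping only k candidates.

    Assumes non-negative weights, so the weighted sum is monotone.
    Returns (results, scanned): results as (score, row_id) pairs in
    descending order, and the number of rows actually scored.
    """
    if k <= 0:
        return [], 0

    def score(vec):
        return sum(w * v for w, v in zip(weights, vec))

    heap = []  # min-heap of (score, row_id) with at most k entries
    scanned = 0
    for ceiling, rid in table:
        # Early termination: no remaining row can beat the current k-th best.
        if len(heap) == k and heap[0][0] >= score(ceiling):
            break
        scanned += 1
        item = (score(rows[rid]), rid)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True), scanned


rows = [(9, 8), (1, 9), (8, 1), (2, 2), (3, 1), (1, 3)]
table = build_presorted_table(rows)
result, scanned = top_k(rows, table, weights=(1, 1), k=2)
print(result, scanned)
```

Because the ceiling vectors are stored rather than precomputed scores, the same table can serve queries with different weights. Memory during the scan is bounded by the k-entry heap, which mirrors the abstract's point that only a fixed number of tuples is maintained.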

Keywords :

Algorithm design and analysis; Computer science; Concrete; Correlation; Data structures; Indexes; Performance evaluation; Early termination; Massive data; Selective retrieval; T2S algorithm; Table scan

fLanguage :

English

Journal_Title :

Knowledge and Data Engineering, IEEE Transactions on

Publisher :

IEEE

ISSN :

1041-4347

Type :

jour

DOI :

10.1109/TKDE.2015.2426691

Filename :

7095576

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=740518