DocumentCode :
1791635
Title :
Lightweight approximate top-k for distributed settings
Author :
Deolalikar, Vinay ; Eshghi, Kave
Author_Institution :
Hewlett Packard Res., Sunnyvale, CA, USA
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
835
Lastpage :
844
Abstract :
Consider the problem of finding the Top-k records in a relation based on the sum of their attributes. This problem occurs in various settings in big data management, for example in geographically distributed data centers and clouds, both at the application layer and the storage management layer. We propose a lightweight, distributed, order- and duplication-insensitive approach based on order statistics. The salient feature of our algorithm that makes it extremely lightweight is that it only processes and communicates the items most likely to be in the Top-k. We validate the efficacy of our algorithm on a wide range of datasets.
Keywords :
Big Data; computer centres; distributed databases; statistical analysis; storage management; big data management; geographically distributed data centers; lightweight approximation; storage management layer; Approximation algorithms; Big data; Distributed databases; Exponential distribution; Google; Merging; Random variables
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004313
Filename :
7004313