Title :
Bi-level Locality Sensitive Hashing for k-Nearest Neighbor Computation
Author :
Pan, Jia ; Manocha, Dinesh
Abstract :
We present a new Bi-level LSH algorithm to perform approximate k-nearest neighbor search in high-dimensional spaces. Our formulation is based on a two-level scheme. In the first level, we use an RP-tree that divides the dataset into sub-groups with bounded aspect ratios and is used to distinguish well-separated clusters. In the second level, we compute a single LSH hash table for each sub-group, along with a hierarchical structure based on space-filling curves. Given a query, we first determine the sub-group that it belongs to and then perform k-nearest neighbor search within the suitable buckets in the LSH hash table corresponding to that sub-group. Our algorithm also maps well to current GPU architectures and can improve the quality of approximate KNN queries compared to prior LSH-based algorithms. We highlight its performance on two large, high-dimensional image datasets. Given a runtime budget, Bi-level LSH can provide better accuracy in terms of recall or error ratio. Moreover, our formulation reduces the variation in runtime cost and in the quality of results.
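The two-level scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation. The class name `BiLevelIndex`, its parameters (`depth`, `num_hashes`, `width`), the median split rule, and the choice of a p-stable (Gaussian projection) hash are all assumptions made for the example. The sketch omits the space-filling-curve hierarchy and the GPU mapping entirely, and it probes only one bucket per query.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_unit_vector(dim, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


class BiLevelIndex:
    """Hypothetical two-level index: a random projection tree whose
    leaves each own a single LSH hash table (illustrative only)."""

    def __init__(self, points, depth=2, num_hashes=4, width=4.0, seed=0):
        self.points = points
        self.dim = len(points[0])
        self.num_hashes = num_hashes
        self.width = width
        self.rng = random.Random(seed)
        self.root = self._build(list(range(len(points))), depth)

    # Level 1: split on a random direction at the median projection
    # (a simplified stand-in for the RP-tree split rule).
    def _build(self, ids, depth):
        if depth == 0 or len(ids) < 2:
            return self._make_leaf(ids)
        direction = random_unit_vector(self.dim, self.rng)
        proj = sorted(dot(self.points[i], direction) for i in ids)
        threshold = proj[len(proj) // 2]
        left = [i for i in ids if dot(self.points[i], direction) < threshold]
        right = [i for i in ids if dot(self.points[i], direction) >= threshold]
        if not left or not right:
            return self._make_leaf(ids)
        return {
            "direction": direction,
            "threshold": threshold,
            "left": self._build(left, depth - 1),
            "right": self._build(right, depth - 1),
        }

    # Level 2: one hash table per sub-group, keyed by the concatenation
    # of num_hashes functions h(x) = floor((a . x + b) / width).
    def _make_leaf(self, ids):
        funcs = [
            ([self.rng.gauss(0.0, 1.0) for _ in range(self.dim)],
             self.rng.uniform(0.0, self.width))
            for _ in range(self.num_hashes)
        ]
        leaf = {"ids": ids, "funcs": funcs, "table": {}}
        for i in ids:
            leaf["table"].setdefault(self._key(leaf, self.points[i]), []).append(i)
        return leaf

    def _key(self, leaf, x):
        return tuple(
            math.floor((dot(a, x) + b) / self.width) for a, b in leaf["funcs"]
        )

    def query(self, q, k):
        # Route the query to its sub-group, then search one bucket there.
        node = self.root
        while "ids" not in node:
            go_left = dot(q, node["direction"]) < node["threshold"]
            node = node["left"] if go_left else node["right"]
        candidates = node["table"].get(self._key(node, q), [])
        ranked = sorted(candidates, key=lambda i: math.dist(q, self.points[i]))
        return ranked[:k]
```

A call such as `BiLevelIndex(points).query(q, k=5)` returns the indices of up to five candidate neighbors, ranked by exact Euclidean distance. Because this sketch inspects a single bucket, it can return fewer than `k` results when that bucket is small.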
Keywords :
graphics processing units; query processing; trees (mathematics); GPU architecture; RP-tree; approximate k-nearest neighbor search; bilevel locality sensitive hashing; hierarchical structure; high dimensional spaces; high-dimensional image datasets; k-nearest neighbor computation; space-filling curves; two-level scheme; Approximation algorithms; Complexity theory; Lattices; Partitioning algorithms; Probes; Runtime; Shape;
Conference_Title :
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
978-1-4673-0042-1
DOI :
10.1109/ICDE.2012.40