DocumentCode
1558086
Title
Optimal Parameters for Locality-Sensitive Hashing
Author
Slaney, Malcolm; Lifshits, Yury; He, Junfeng
Author_Institution
Yahoo! Research, Sunnyvale, CA, USA
Volume
100
Issue
9
fYear
2012
Firstpage
2604
Lastpage
2623
Abstract
Locality-sensitive hashing (LSH) is the basis of many algorithms that use a probabilistic approach to find nearest neighbors. We describe an algorithm for optimizing the parameters and use of LSH. Prior work ignores these issues or suggests a search for the best parameters. We start with two histograms: one that characterizes the distributions of distances to a point's nearest neighbors and the second that characterizes the distance between a query and any point in the data set. Given a desired performance level (the chance of finding the true nearest neighbor) and a simple computational cost model, we return the LSH parameters that allow an LSH index to meet the performance goal and have the minimum computational cost. We can also use this analysis to connect LSH to deterministic nearest-neighbor algorithms such as k-d trees and thus start to unify the two approaches.
Keywords
Database systems; Histograms; Indexing; Multimedia communication; Nearest neighbor searches; Optimization; Quantization; Database index; information retrieval; locality-sensitive hashing; multimedia databases; nearest-neighbor search
fLanguage
English
Journal_Title
Proceedings of the IEEE
Publisher
IEEE
ISSN
0018-9219
Type
jour
DOI
10.1109/JPROC.2012.2193849
Filename
6242372