• DocumentCode
    1558086
  • Title

    Optimal Parameters for Locality-Sensitive Hashing

  • Author

    Slaney, Malcolm ; Lifshits, Yury ; He, Junfeng

  • Author_Institution
    Yahoo! Research, Sunnyvale, CA, USA
  • Volume
    100
  • Issue
    9
  • fYear
    2012
  • Firstpage
    2604
  • Lastpage
    2623
  • Abstract
    Locality-sensitive hashing (LSH) is the basis of many algorithms that use a probabilistic approach to find nearest neighbors. We describe an algorithm for optimizing the parameters and use of LSH. Prior work ignores these issues or suggests a search for the best parameters. We start with two histograms: one that characterizes the distributions of distances to a point's nearest neighbors and the second that characterizes the distance between a query and any point in the data set. Given a desired performance level (the chance of finding the true nearest neighbor) and a simple computational cost model, we return the LSH parameters that allow an LSH index to meet the performance goal and have the minimum computational cost. We can also use this analysis to connect LSH to deterministic nearest-neighbor algorithms such as k-d trees and thus start to unify the two approaches.
  • Keywords
    Database systems; Histograms; Indexing; Multimedia communication; Nearest neighbor searches; Optimization; Quantization; Database index; information retrieval; locality-sensitive hashing; multimedia databases; nearest-neighbor search
  • fLanguage
    English
  • Journal_Title
    Proceedings of the IEEE
  • Publisher
    IEEE
  • ISSN
    0018-9219
  • Type

    jour

  • DOI
    10.1109/JPROC.2012.2193849
  • Filename
    6242372