• DocumentCode
    3013481
  • Title

    High dimensional similarity search with space filling curves

  • Author

    Liao, Swanwa ; Lopez, Mario A. ; Leutenegger, Scott T.

  • Author_Institution
    Dept. of Math. & Comput. Sci., Denver Univ., CO, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    615
  • Lastpage
    622
  • Abstract
    We present a new approach for approximate nearest neighbor queries for sets of high dimensional points under any Lt-metric, t=1,...,∞. The proposed algorithm is efficient and simple to implement. The algorithm uses multiple shifted copies of the data points and stores them in up to (d+1) B-trees where d is the dimensionality of the data, sorted according to their position along a space filling curve. This is done in a way that allows us to guarantee that a neighbor within an O(d1+1t/) factor of the exact nearest, can be returned with at most (d+1)log, n page accesses, where p is the branching factor of the B-trees. In practice, for real data sets, our approximate technique finds the exact nearest neighbor between 87% and 99% of the time and a point no farther than the third nearest neighbor between 98% and 100% of the time. Our solution is dynamic, allowing insertion or deletion of points in O(d logp n) page accesses and generalizes easily to find approximate k-nearest neighbors
  • Keywords
    data mining; database theory; query processing; tree data structures; very large databases; B-trees; approximate nearest neighbor queries; branching factor; data mining; data sets; high dimensional similarity search; page access; space filling curves; Approximation algorithms; Computer science; Data mining; Filling; Image processing; Mathematics; Nearest neighbor searches; Neural networks; Pattern recognition; Sorting;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering, 2001. Proceedings. 17th International Conference on
  • Conference_Location
    Heidelberg
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-1001-9
  • Type

    conf

  • DOI
    10.1109/ICDE.2001.914876
  • Filename
    914876