• DocumentCode
    1245663
  • Title

    Antipole tree indexing to support range search and k-nearest neighbor search in metric spaces

  • Author

    Cantone, Domenico ; Ferro, Alfredo ; Pulvirenti, Alfredo ; Recupero, Diego Reforgiato ; Shasha, Dennis

  • Author_Institution
    Dipt. di Matematica e Informatica, Catania Univ., Italy
  • Volume
    17
  • Issue
    4
  • fYear
    2005
  • fDate
    4/1/2005 12:00:00 AM
  • Firstpage
    535
  • Lastpage
    550
  • Abstract
    Range and k-nearest neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, in a range searching problem the goal is to find the objects of S within some threshold distance to q, whereas in a k-nearest neighbor searching problem, the k elements of S closest to q must be produced. These problems can obviously be solved with a linear number of distance calculations, by comparing the query object against every object in the database. However, the goal is to solve such problems much faster. We combine and extend ideas from the M-tree, the multivantage point structure, and the FQ-tree to create a new structure in the "bisector tree" class, called the Antipole tree. Bisection is based on the proximity to an "Antipole" pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament are far enough apart to approximate the diameter of the splitting set. If dist(a, b) is larger than the chosen cluster diameter threshold, then the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best match searching on generic metric spaces. The Antipole tree outperforms existing structures such as list of clusters, M-trees, and others by a factor of approximately two and, in many cases, it achieves better clustering properties.
  • Keywords
    database indexing; pattern clustering; query processing; tree data structures; tree searching; Antipole tree indexing; information retrieval; k-nearest neighbor search; metric space; pattern recognition; query processing; tree data structure; Algorithm design and analysis; Binary trees; Clustering algorithms; Data structures; Databases; Extraterrestrial measurements; Indexing; Information retrieval; Partitioning algorithms; Pattern recognition; Indexing methods; information search and retrieval; similarity measures
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2005.53
  • Filename
    1401892
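
The abstract describes three ideas that are easy to lose in prose: the linear-scan baseline for range and k-nearest neighbor queries, the randomized tournament that yields an approximate farthest ("Antipole") pair, and the split by proximity to that pair. The following is a minimal Python sketch of those ideas, not the authors' implementation. The function names, the Euclidean metric, the `group_size` parameter, and the exact tournament rule (keep the farthest pair of each small random group) are assumptions made for illustration; the record above does not specify them.

```python
import heapq
import math
import random
from itertools import combinations


def euclidean(a, b):
    # Example metric only; any function satisfying the metric axioms works.
    return math.dist(a, b)


def range_search(S, q, t, dist=euclidean):
    """Linear-scan baseline: objects of S within distance t of q."""
    return [x for x in S if dist(x, q) <= t]


def knn_search(S, q, k, dist=euclidean):
    """Linear-scan baseline: the k objects of S closest to q."""
    return heapq.nsmallest(k, S, key=lambda x: dist(x, q))


def farthest_pair(items, dist):
    """Exact farthest pair by brute force; used only on small groups."""
    return max(combinations(items, 2), key=lambda p: dist(*p))


def approx_antipole(S, dist=euclidean, group_size=3, rng=random):
    """Assumed tournament: shuffle, cut into small groups, keep each
    group's farthest pair, repeat until one small group remains."""
    assert group_size >= 3, "groups of 2 would never shrink the pool"
    pool = list(S)
    assert len(pool) >= 2, "need at least two objects"
    while len(pool) > group_size:
        rng.shuffle(pool)
        winners = []
        for i in range(0, len(pool), group_size):
            group = pool[i:i + group_size]
            if len(group) >= 2:
                winners.extend(farthest_pair(group, dist))
            else:
                winners.extend(group)  # a leftover singleton advances
        pool = winners
    return farthest_pair(pool, dist)


def bisect_by_antipole(S, a, b, dist=euclidean):
    """Assign each object to whichever of a, b it is closer to."""
    near_a, near_b = [], []
    for x in S:
        (near_a if dist(x, a) <= dist(x, b) else near_b).append(x)
    return near_a, near_b


def maybe_split(S, threshold, dist=euclidean, rng=random):
    """Split the cluster only if the approximate diameter exceeds threshold."""
    a, b = approx_antipole(S, dist, rng=rng)
    if dist(a, b) > threshold:
        return bisect_by_antipole(S, a, b, dist)
    return None  # cluster is kept whole
```

In a full index the split would be applied recursively to each half, and the tree would then be used to prune distance computations at query time; that part is not sketched here. The tournament computes the exact farthest pair only inside groups of constant size, which is what keeps the number of distance calculations linear in the size of S.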