• DocumentCode
    3334665
  • Title

    Effective use of space for pivot-based metric indexing structures

  • Author

    Celik, Cengiz

  • Author_Institution
    Dept. of Comput. Eng., Bilkent Univ., Ankara
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    402
  • Lastpage
    409
  • Abstract
    Among the metric space indexing methods, AESA is known to produce the lowest query costs in terms of the number of distance computations. However, its quadratic construction cost and space consumption make it infeasible for large datasets. There has been some work on reducing the space requirements of AESA. Instead of keeping all the distances between objects, LAESA appoints a subset of the database as pivots, keeping only the distances between objects and pivots. Kvp uses the idea of prioritizing the pivots based on their distances to objects, keeping only the pivot distances that it evaluates as promising. FQA discretizes the distances using a fixed number of bits per distance instead of the system's floating-point types. Varying the number of bits to produce a performance-space trade-off was also studied in Kvp. Recently, BAESA has been proposed based on the same idea, but using different distance ranges for each pivot. The t-spanner based indexing structure compacts the distance matrix by introducing an approximation factor that makes the pivots less effective. In this work, we show that the Kvp prioritization is oriented toward symmetric distance distributions. We offer a new method that evaluates the effectiveness of pivots in a better fashion by making use of the overall distance distribution. We also simulate the performance of our method combined with distance discretization. Our results show that our approach is able to offer very good space-performance trade-offs compared to AESA and tree-based methods.
  • Keywords
    database indexing; query processing; tree data structures; very large databases; Kvp prioritization; approximation factor; distance matrix; floating point type; large dataset; pivot-based metric space indexing structure; quadratic construction cost; space consumption; t-spanner based indexing structure; very large database; Buildings; Conferences; Costs; Databases; Engineering profession; Extraterrestrial measurements; Indexing; Query processing; Symmetric matrices; Tree data structures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering Workshop, 2008. ICDEW 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-2161-9
  • Electronic_ISBN
    978-1-4244-2162-6
  • Type

    conf

  • DOI
    10.1109/ICDEW.2008.4498351
  • Filename
    4498351