• DocumentCode
    1451379
  • Title

    Multiple similarity queries: a basic DBMS operation for mining in metric databases

  • Author

    Braunmüller, Bernhard ; Ester, Martin ; Kriegel, Hans-Peter ; Sander, Jörg

  • Author_Institution
    Inst. of Comput. Sci., München Univ., Germany
  • Volume
    13
  • Issue
    1
  • fYear
    2001
  • Firstpage
    79
  • Lastpage
    95
  • Abstract
    Metric databases are databases where a metric distance function is defined for pairs of database objects. In such databases, similarity queries in the form of range queries or k-nearest-neighbor queries are the most important query types. In traditional query processing, single queries are issued independently by different users. In many data mining applications, however, the database is typically explored by iteratively asking similarity queries for answers of previous similarity queries. We introduce a generic scheme for such data mining algorithms and we investigate two orthogonal approaches, reducing I/O cost as well as CPU cost, to speed up the processing of multiple similarity queries. The proposed techniques apply to any type of similarity query and to an implementation based on an index or using a sequential scan. Parallelization yields an additional impressive speed-up. An extensive performance evaluation confirms the efficiency of our approach.
  • Keywords
    data mining; database indexing; query processing; software performance evaluation; very large databases; CPU cost; data mining; index; input output cost; k-nearest-neighbor queries; large databases; metric databases; metric distance function; multiple similarity queries; performance evaluation; query processing; range queries; sequential scan; similarity queries; Clustering algorithms; Computer Society; Costs; Data mining; Extraterrestrial measurements; Indexing; Iterative algorithms; Multimedia databases; Query processing; Spatial databases
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.908982
  • Filename
    908982