• DocumentCode
    2985197
  • Title
    Dimensional Testing for Multi-step Similarity Search
  • Author
    Houle, Michael E.; Ma, Xiguo; Nett, Michael; Oria, Vincent
  • Author_Institution
    National Institute of Informatics, Tokyo, Japan
  • fYear
    2012
  • fDate
    10-13 Dec. 2012
  • Firstpage
    299
  • Lastpage
    308
  • Abstract
    In data mining applications such as subspace clustering or feature selection, changes to the underlying feature set can require the reconstruction of search indices to support fundamental data mining tasks. For such situations, multi-step search approaches have been proposed that can accommodate changes in the underlying similarity measure without the need to rebuild the index. In this paper, we present a heuristic multi-step search algorithm that utilizes a measure of intrinsic dimension, the generalized expansion dimension (GED), as the basis of its search termination condition. Compared to the current state-of-the-art method, experimental results show that our heuristic approach is able to obtain significant improvements in both the number of candidates and the running time, while losing very little in the accuracy of the query results.
  • Keywords
    data mining; pattern clustering; query processing; GED; dimensional testing; feature selection; feature set; generalized expansion dimension; heuristic multistep search algorithm; intrinsic dimension; multistep similarity search; query result; search indices; search termination condition; similarity measure; subspace clustering; Algorithm design and analysis; Approximation algorithms; Heuristic algorithms; Indexes; Measurement; Vectors; Similarity search; adaptive similarity; intrinsic dimensionality; kNN; multi-step; nearest neighbor
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2012 IEEE 12th International Conference on
  • Conference_Location
    Brussels
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4673-4649-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2012.91
  • Filename
    6413893