• DocumentCode
    2026301
  • Title

    TOP-K cosine similarity interesting pairs search

  • Author

    Zhu, Shiwei ; Wu, Junjie ; Xia, Guoping

  • Author_Institution
    Sch. of Econ. & Manage., Beihang Univ., Beijing, China
  • Volume
    3
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    1479
  • Lastpage
    1483
  • Abstract
    Recent years have witnessed an increased interest in computing cosine similarities between documents (or commodities). Most previous studies require the specification of a minimum similarity threshold to perform cosine similarity search. However, it is usually difficult for users to provide an appropriate threshold in practice. Instead, in this paper, we propose to search for the top-K most strongly related pairs of objects as measured by the cosine similarity. Specifically, we first define the cosine similarity measure from the association analysis point of view and identify the monotone property of an upper bound of the cosine measure, then exploit a diagonal traversal strategy to develop the TOP-DATA and TOP-DATA-R algorithms. Finally, experimental results demonstrate the computational efficiency of the above algorithms.
  • Keywords
    data mining; discrete cosine transforms; search problems; TOP-DATA-R algorithms; TOP-K cosine similarity measure; computing cosine similarity search; data association; data mining; diagonal traversal strategy; minimum similarity threshold; pairs search; Arrays; Complexity theory; Correlation; Data mining; Upper bound; Vectors; Anti-Monotone Property; Association Analysis; Cosine Similarity; Interestingness Measure; Similarity Search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type

    conf

  • DOI
    10.1109/FSKD.2010.5569212
  • Filename
    5569212
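
The abstract's central idea, a threshold-free top-K search pruned by an upper bound on the cosine measure, can be illustrated with a minimal sketch. It is not the paper's TOP-DATA or TOP-DATA-R algorithm and does not use the diagonal traversal. It assumes the usual association-analysis form cos(A, B) = supp(AB) / sqrt(supp(A) * supp(B)), from which supp(AB) <= min(supp(A), supp(B)) gives the bound sqrt(min supp / max supp). The function name `top_k_cosine_pairs` and the toy transactions are hypothetical.

```python
import heapq
from itertools import combinations
from math import sqrt


def top_k_cosine_pairs(transactions, k):
    """Return up to k (cosine, item_a, item_b) tuples, highest cosine first."""
    if k <= 0:
        return []

    # Transaction-id set per item; its size is the item's support count.
    tids = {}
    for n, t in enumerate(transactions):
        for item in set(t):
            tids.setdefault(item, set()).add(n)
    supp = {item: len(s) for item, s in tids.items()}

    # Sort items by descending support so the first item of each pair
    # always has support >= the second (ties broken by item name).
    items = sorted(supp, key=lambda i: (-supp[i], i))

    heap = []  # min-heap holding the best k pairs found so far
    for a, b in combinations(items, 2):
        # Upper bound: supp(ab) <= supp(b), hence cos <= sqrt(supp(b)/supp(a)).
        upper = sqrt(supp[b] / supp[a])
        if len(heap) == k and upper <= heap[0][0]:
            continue  # this pair cannot beat the current k-th best
        cos = len(tids[a] & tids[b]) / sqrt(supp[a] * supp[b])
        if len(heap) < k:
            heapq.heappush(heap, (cos, a, b))
        elif cos > heap[0][0]:
            heapq.heapreplace(heap, (cos, a, b))
    return sorted(heap, reverse=True)


# Toy data (assumed for illustration only).
transactions = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
best = top_k_cosine_pairs(transactions, 1)
```

The bound depends only on the two single-item supports, so it is available before any joint support is counted. This sketch still enumerates every pair and only skips the joint-support computation; a traversal that exploits the bound's monotonicity over the support-sorted item list, as the abstract describes, could skip whole groups of pairs at once.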