• DocumentCode
    87766
  • Title
    Diverse Set Selection Over Dynamic Data
  • Author
    Drosou, Marina; Pitoura, Evaggelia
  • Author_Institution
    Comput. Sci. Dept., Univ. of Ioannina, Ioannina, Greece
  • Volume
    26
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    1102
  • Lastpage
    1116
  • Abstract
    Result diversification has recently attracted considerable attention as a means of increasing user satisfaction in recommender systems, as well as in web and database search. In this paper, we focus on the problem of selecting the k most diverse items from a result set. Whereas previous research has mainly considered the static version of the problem, we address the dynamic case in which the result set changes over time, as, for example, in notification services. We define the CONTINUOUS k-DIVERSITY PROBLEM along with appropriate constraints that enforce continuity requirements on the diversified results. Our proposed approach is based on cover trees and supports dynamic item insertion and deletion. The diversification problem is in general NP-hard; we provide theoretical bounds that characterize the quality of our cover tree solution with respect to the optimal one. Since results are often associated with a relevance score, we extend our approach to account for relevance. Finally, we report experimental results concerning the efficiency and effectiveness of our approach on a variety of real and synthetic datasets.
  • Keywords
    Internet; computational complexity; database management systems; human factors; information filtering; optimisation; recommender systems; tree data structures; NP-hard diversification problem; continuity requirements; continuous k-diversity problem; cover tree solution; database search; dynamic item deletion; dynamic item insertion; indexing methods; k-most diverse item selection; real datasets; synthetic datasets; user satisfaction; Approximation algorithms; Complexity theory; Computational modeling; Diversity reception; Heuristic algorithms; Indexes; Silicon; Database Applications; Database Management; Information Search and Retrieval; Information Storage and Retrieval; Information Technology and Systems; Physical Design; Selection process; search process; similarity measures
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2013.44
  • Filename
    6477041
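
The abstract above concerns selecting the k most diverse items from a result set. As a rough illustration of the static version of that problem only, the following is a minimal sketch of a common greedy max-min heuristic. It is not the cover tree method the abstract describes, and it does not handle insertions, deletions, or relevance scores. The function name `greedy_max_min`, the use of Euclidean distance, and the choice to seed the selection with the first item are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def greedy_max_min(items: Sequence[Point], k: int) -> List[Point]:
    """Greedy heuristic (assumed for illustration): repeatedly add the
    item whose distance to its nearest already-selected item is largest."""
    if k <= 0 or not items:
        return []

    # Assumption: seed with the first item; other seeding rules are possible.
    selected = [items[0]]

    # min_dist[i] is the distance from items[i] to its nearest selected item.
    # Already-selected items are marked with -1 so they are never picked again.
    min_dist = [math.dist(p, items[0]) for p in items]
    min_dist[0] = -1.0

    target = min(k, len(items))
    while len(selected) < target:
        # Pick the unselected item farthest from the current selection.
        best = max(range(len(items)), key=lambda i: min_dist[i])
        selected.append(items[best])
        # Refresh nearest-selected distances with the newly added item.
        min_dist = [
            min(d, math.dist(p, items[best])) for d, p in zip(min_dist, items)
        ]
        min_dist[best] = -1.0

    return selected


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
    print(greedy_max_min(points, 3))
```

Each round of this sketch scans all items, so it costs on the order of n distance computations per selected item and must be rerun from scratch whenever the result set changes. That recomputation cost is the kind of limitation that motivates an index-based approach for the dynamic case.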