• DocumentCode
    2059920
  • Title

    An Improved Profile-Based CF Scheme with Privacy

  • Author

    Bilge, Alper ; Polat, Huseyin

  • Author_Institution
    Dept. of Comput. Eng., Anadolu Univ., Eskisehir, Turkey
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    133
  • Lastpage
    140
  • Abstract
    Traditional collaborative filtering (CF) systems, which widely employ k-nearest neighbor (kNN) algorithms, mainly attempt to alleviate the contemporary problem of information overload by generating personalized predictions for items that users might like. Despite their popularity and extensive usage, they suffer from several problems. First, as the number of users and/or items increases, scalability becomes a challenge. Second, as the number of ratable items increases while the number of ratings provided by each individual remains a tiny fraction of it, CF systems suffer from the sparsity problem. Third, many schemes fail to protect private data, which is referred to as the privacy problem. Due to such problems, accuracy and online performance become worse. In this paper, we propose two preprocessing schemes to overcome the scalability and sparsity problems. First, we suggest using a novel content-based profiling of users to estimate similarities on reduced data for better performance. Second, we propose a pseudo-prediction protocol to help CF systems surmount sparsity. Finally, we propose using randomization methods to preserve individual users' confidential data, and we show that our proposed preprocessing schemes can be applied to perturbed data. We analyze our schemes in terms of privacy. To investigate their effects on accuracy and performance, we perform experiments based on real data. Empirical results demonstrate that our preprocessing schemes improve both performance and accuracy.
  • Keywords
    content-based retrieval; data privacy; data reduction; groupware; information filtering; pattern matching; recommender systems; collaborative filtering system; content-based users profile; k-nearest neighbor algorithm; preprocessing scheme; profile-based CF scheme; pseudoprediction protocol; randomization method; sparsity problem; Accuracy; Motion pictures; Prediction algorithms; Privacy; Scalability; performance; preprocessing; profiling; recommendation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
  • Conference_Location
    Palo Alto, CA
  • Print_ISBN
    978-1-4577-1648-5
  • Electronic_ISBN
    978-0-7695-4492-2
  • Type

    conf

  • DOI
    10.1109/ICSC.2011.20
  • Filename
    6061422