• DocumentCode
    1631383
  • Title

    The price of privacy in untrusted recommendation engines

  • Author

    Banerjee, Sean ; Hegde, Nayana ; Massoulie, Laurent

  • Author_Institution
    Dept. of ECE, Univ. of Texas at Austin, Austin, TX, USA
  • fYear
    2012
  • Firstpage
    920
  • Lastpage
    927
  • Abstract
    Recent increase in online privacy concerns prompts the following question: can a recommendation engine be accurate if end-users do not entrust it with their private data? To answer this, we study the problem of predicting user-ratings under local or `user-end´ differential privacy, a powerful, formal notion of data privacy. We develop a systematic approach for lower bounds on the complexity of learning item structure from privatized user inputs, based on mutual information. Our results identify a sample complexity separation between learning in the scarce information regime and the rich information regime, thereby highlighting the role of the amount of ratings (information) available to each user. In the information-rich regime (where each user rates a constant fraction of items), a spectral clustering approach is shown to achieve optimal sample complexity. However, the information-scarce regime (where each user rates only a vanishing fraction of the total item set) is found to require a fundamentally different approach. We propose a new algorithm, MaxSense, and show that it achieves optimal sample complexity in this setting. The techniques we develop for bounding mutual information may be of broader interest. To illustrate this, we show their applicability to (i) learning based on 1-bit sketches (in contrast to differentially private sketches), and (ii) adaptive learning, where queries can be adapted based on answers to past queries.
  • Keywords
    Internet; computational complexity; data privacy; query processing; recommender systems; sampling methods; trusted computing; 1-bit sketches; MaxSense algorithm; adaptive learning; data privacy; end-users; information-rich regime; information-scarce regime; learning item structure complexity; lower bounds; mutual information; online privacy concerns; optimal sample complexity; privatized user inputs; queries; spectral clustering approach; untrusted recommendation engines; user-end differential privacy; user-rating prediction; Algorithm design and analysis; Clustering algorithms; Complexity theory; Data privacy; Mutual information; Privacy; Recommender systems;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Communication, Control, and Computing (Allerton), 2012 50th Annual Allerton Conference on
  • Conference_Location
    Monticello, IL
  • Print_ISBN
    978-1-4673-4537-8
  • Type

    conf

  • DOI
    10.1109/Allerton.2012.6483317
  • Filename
    6483317