  • DocumentCode
    2413079
  • Title
    Distributed Data Mining with Differential Privacy
  • Author
    Zhang, Ning ; Li, Ming ; Lou, Wenjing
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Worcester Polytech. Inst., Worcester, MA, USA
  • fYear
    2011
  • fDate
    5-9 June 2011
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    With recent advances in communication and data storage technology, an explosive amount of information is being collected and stored on the Internet. Even though such a vast amount of information presents great opportunities for knowledge discovery, organizations might not want to share their data for legal or competitive reasons. This poses the challenge of mining knowledge while preserving privacy. Current efficient privacy-preserving data mining algorithms are based on the assumption that it is acceptable to release all the intermediate results during the data mining operations. However, it has been shown that such intermediate results can still leak private information. In this work, we use differential privacy to quantitatively limit such information leakage. Differential privacy is a newly emerged privacy definition that is capable of providing strong, measurable privacy guarantees. We propose Secure group Differential private Query (SDQ), a new algorithm that combines techniques from differential privacy and secure multiparty computation. Using decision tree induction as a case study, we show that SDQ can achieve stronger privacy than current efficient secure multiparty computation approaches, and better accuracy than current differential privacy approaches, while maintaining efficiency.
  • Keywords
    Internet; data mining; data privacy; decision trees; groupware; law; storage management; competitive reasons; data communication; data storage technology; decision tree; distributed data mining; knowledge discovery; legal reasons; private information; secure group differential private query; secure multiparty computation; Distributed databases; Noise; Privacy
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications (ICC), 2011 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1550-3607
  • Print_ISBN
    978-1-61284-232-5
  • Electronic_ISBN
    1550-3607
  • Type
    conf
  • DOI
    10.1109/icc.2011.5962863
  • Filename
    5962863