• DocumentCode
    2847145
  • Title
    A framework for high-accuracy privacy-preserving mining
  • Author
    Agrawal, Shipra; Haritsa, Jayant R.
  • Author_Institution
    Database Syst. Lab., Indian Inst. of Sci., Bangalore, India
  • fYear
    2005
  • fDate
    5-8 April 2005
  • Firstpage
    193
  • Lastpage
    204
  • Abstract
    To preserve client privacy in the data mining process, a variety of techniques based on random perturbation of individual data records have been proposed recently. In this paper, we present FRAPP, a generalized matrix-theoretic framework of random perturbation, which facilitates a systematic approach to the design of perturbation mechanisms for privacy-preserving mining. Specifically, FRAPP is used to demonstrate that (a) the prior techniques differ only in their choices for the perturbation matrix elements, and (b) a symmetric perturbation matrix with minimal condition number can be identified, maximizing the accuracy even under strict privacy guarantees. We also propose a novel perturbation mechanism wherein the matrix elements are themselves characterized as random variables, and demonstrate that this feature provides significant improvements in privacy at only a marginal cost in accuracy. The quantitative utility of FRAPP, which applies to random-perturbation-based privacy-preserving mining in general, is evaluated specifically with regard to frequent-itemset mining on a variety of real datasets. Our experimental results indicate that, for a given privacy requirement, substantially lower errors are incurred, with respect to both itemset identity and itemset support, as compared to the prior techniques.
  • Keywords
    data mining; data privacy; matrix algebra; FRAPP framework; data mining process; frequent-itemset mining; matrix-theoretic framework; perturbation matrix elements; privacy-preserving mining framework; random perturbation; random variables; real datasets; Costs; Database systems; Electronic commerce; Itemsets; Perturbation methods; Symmetric matrices; Transaction databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Proceedings of the 21st International Conference on Data Engineering (ICDE 2005)
  • ISSN
    1084-4627
  • Print_ISBN
    0-7695-2285-8
  • Type
    conf
  • DOI
    10.1109/ICDE.2005.8
  • Filename
    1410122
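
  • Illustrative sketch (not part of the original record)
    The abstract describes perturbing each record through a perturbation matrix and choosing a symmetric matrix with minimal condition number. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes a "gamma-diagonal" matrix whose diagonal entries are gamma times the off-diagonal entries; the function names, the parameter gamma, and the toy counts in the usage note are all hypothetical and chosen for illustration only.

```python
import random


def gamma_diagonal(n, gamma):
    """Build an n x n symmetric matrix with gamma*x on the diagonal and x
    elsewhere, where x = 1 / (gamma + n - 1) so every column sums to 1.
    Assumption: this matrix form and the name `gamma` are illustrative."""
    x = 1.0 / (gamma + n - 1)
    return [[gamma * x if i == j else x for j in range(n)] for i in range(n)]


def condition_number(n, gamma):
    """Condition number of the matrix above. Its eigenvalues are 1 (once)
    and (gamma - 1) / (gamma + n - 1) (n - 1 times), so the ratio of the
    largest to the smallest is (gamma + n - 1) / (gamma - 1)."""
    return (gamma + n - 1) / (gamma - 1)


def perturb(value, matrix, rng):
    """Replace an original value (a column index) with a random value drawn
    from that column of the perturbation matrix."""
    n = len(matrix)
    weights = [matrix[v][value] for v in range(n)]
    return rng.choices(range(n), weights=weights, k=1)[0]


def reconstruct(perturbed_counts, gamma):
    """Estimate original counts X from perturbed counts Y by solving A X = Y.
    For the matrix above, Y_v = (gamma - 1) * x * X_v + x * N, where N is the
    total number of records, which gives the closed form used here."""
    n = len(perturbed_counts)
    total = sum(perturbed_counts)
    x = 1.0 / (gamma + n - 1)
    return [(y - x * total) / ((gamma - 1) * x) for y in perturbed_counts]
```

    A possible use: perturb every record with perturb(), tally the perturbed values into counts, and pass the counts to reconstruct(). On expected counts the estimate is exact; on sampled data it is noisy, and a smaller condition number limits how much that noise is amplified. Raising gamma lowers the condition number but makes the perturbed value reveal more about the original, which is the privacy and accuracy trade-off the abstract refers to.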