• DocumentCode
    2981933
  • Title
    Differentially Private Histogram Publishing through Lossy Compression
  • Author
    Acs, Gergely; Castelluccia, C.; Chen, Rui
  • Author_Institution
    INRIA, Sophia-Antipolis, France
  • fYear
    2012
  • fDate
    10-13 Dec. 2012
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Differential privacy has emerged as one of the most promising privacy models for private data release. It can be used to release different types of data, and, in particular, histograms, which provide useful summaries of a dataset. Several differentially private histogram releasing schemes have been proposed recently. However, most of them directly add noise to the histogram counts, resulting in undesirable accuracy. In this paper, we propose two sanitization techniques that exploit the inherent redundancy of real-life datasets in order to boost the accuracy of histograms. They lossily compress the data and sanitize the compressed data. Our first scheme is an optimization of the Fourier Perturbation Algorithm (FPA) presented in [13]. It improves the accuracy of the initial FPA by a factor of 10. The other scheme relies on clustering and exploits the redundancy between bins. Our extensive experimental evaluation over various real-life and synthetic datasets demonstrates that our techniques preserve very accurate distributions and considerably improve the accuracy of range queries over attributed histograms.
  • Keywords
    Fourier analysis; data compression; data privacy; pattern clustering; perturbation techniques; publishing; FPA; Fourier perturbation algorithm; attributed histograms; compressed data sanitization techniques; differential privacy models; differentially private histogram publishing; differentially private histogram releasing schemes; lossy compression; real-life dataset inherent redundancy; real-life datasets; synthetic datasets; Data privacy; Databases; Discrete Fourier transforms; Histograms; Noise; Privacy; Sensitivity; Differential privacy; Fourier transform; clustering; histogram
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2012 IEEE 12th International Conference on
  • Conference_Location
    Brussels
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4673-4649-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2012.80
  • Filename
    6413718
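
The compress-then-sanitize idea in the abstract can be illustrated with a minimal sketch in the spirit of Fourier perturbation: keep only the lowest-frequency DFT coefficients of the histogram, add Laplace noise to those, and invert the transform. This is not the authors' optimized scheme. The names `fourier_perturb_sketch`, `laplace`, `k`, and `noise_scale` are hypothetical, and `noise_scale` is a free parameter here; it is not calibrated to a privacy budget, so the sketch alone gives no differential privacy guarantee.

```python
import cmath
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def fourier_perturb_sketch(counts, k, noise_scale, rng=None):
    """Illustrative only: perturb the k lowest-frequency DFT coefficients.

    `noise_scale` is an assumed, uncalibrated parameter (hypothetical name).
    """
    rng = rng or random.Random()
    n = len(counts)
    # Stay strictly below the Nyquist frequency so each kept coefficient
    # (other than the first) has exactly one conjugate partner.
    k = max(1, min(k, (n - 1) // 2 + 1))

    # Lossy compression: compute only the first k DFT coefficients.
    coeffs = [
        sum(c * cmath.exp(-2j * cmath.pi * f * t / n) for t, c in enumerate(counts))
        for f in range(k)
    ]

    # Sanitization: add Laplace noise to the real and imaginary parts.
    if noise_scale > 0:
        coeffs = [
            z + complex(laplace(noise_scale, rng), laplace(noise_scale, rng))
            for z in coeffs
        ]

    # Reconstruction: inverse DFT with all dropped coefficients set to zero.
    # Conjugate symmetry of a real signal gives the factor of 2.
    out = []
    for t in range(n):
        value = coeffs[0].real
        for f in range(1, k):
            value += 2.0 * (coeffs[f] * cmath.exp(2j * cmath.pi * f * t / n)).real
        out.append(value / n)
    return out
```

With `noise_scale=0.0` the function reduces to plain low-pass compression, which makes the trade-off visible: a small `k` means fewer coefficients to perturb but a coarser reconstruction of the bin counts.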