• DocumentCode
    3322785
  • Title
    On the Anonymization of Sparse High-Dimensional Data
  • Author
    Ghinita, Gabriel ; Tao, Yufei ; Kalnis, Panos
  • Author_Institution
    Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    715
  • Lastpage
    724
  • Abstract
    Existing research on privacy-preserving data publishing focuses on relational data: in this context, the objective is to enforce privacy-preserving paradigms, such as k-anonymity and ℓ-diversity, while minimizing the information loss incurred in the anonymization process (i.e., maximizing data utility). However, existing techniques adopt an indexing- or clustering-based approach, and work well only for fixed-schema data with low dimensionality. Certain applications nevertheless require privacy-preserving publishing of transaction data (or basket data), which involves hundreds or even thousands of dimensions, rendering existing methods unusable. We propose a novel anonymization method for sparse high-dimensional data. We employ a particular representation that captures the correlation in the underlying data and facilitates the formation of anonymized groups with low information loss. We propose an efficient anonymization algorithm based on this representation. We show experimentally, using real-life datasets, that our method clearly outperforms the existing state of the art in terms of both data utility and computational overhead.
  • Keywords
    data privacy; data structures; database indexing; pattern clustering; relational databases; security of data; very large databases; clustering-based approach; data representation; indexing-based approach; privacy-preserving data publishing; relational database; sparse high-dimensional data anonymization algorithm; Computer science; Data engineering; Data mining; Data privacy; Hospitals; Pregnancy test; Protection; Publishing; Tin; Voting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-1836-7
  • Electronic_ISBN
    978-1-4244-1837-4
  • Type
    conf
  • DOI
    10.1109/ICDE.2008.4497480
  • Filename
    4497480