• DocumentCode
    1559477
  • Title

    Practical data-oriented microaggregation for statistical disclosure control

  • Author

    Domingo-Ferrer, J. ; Mateo-Sanz, Josep M.

  • Author_Institution
    Dept. of Comput. Sci., Univ. Rovira i Virgili, Catalonia, Spain
  • Volume
    14
  • Issue
    1
  • fYear
    2002
  • Firstpage
    189
  • Lastpage
    201
  • Abstract
    Microaggregation is a statistical disclosure control technique for microdata disseminated in statistical databases. Raw microdata (i.e., individual records or data vectors) are grouped into small aggregates prior to publication. Each aggregate should contain at least k data vectors to prevent disclosure of individual information, where k is a constant value preset by the data protector. No exact polynomial algorithms are known to date to microaggregate optimally, i.e., with minimal variability loss. Methods in the literature rank data and partition them into groups of fixed size; in the multivariate case, ranking is performed by projecting data vectors onto a single axis. In this paper, candidate optimal solutions to the multivariate and univariate microaggregation problems are characterized. In the univariate case, two heuristics based on hierarchical clustering and genetic algorithms are introduced which are data-oriented in that they try to preserve natural data aggregates. In the multivariate case, fixed-size and hierarchical clustering microaggregation algorithms are presented which do not require data to be projected onto a single dimension; such methods clearly reduce variability loss as compared to conventional multivariate microaggregation on projected data.
  • Keywords
    data privacy; genetic algorithms; statistical databases; hierarchical clustering; microaggregation; microdata; statistical disclosure; Aggregates; Clustering algorithms; Databases; Partitioning algorithms; Polynomials; Protection
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.979982
  • Filename
    979982
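
The abstract describes the conventional approach of ranking data and partitioning them into fixed-size groups of at least k records. The following is a minimal sketch of that baseline for the univariate case only, not the heuristics introduced in the paper. The function name `fixed_size_microaggregate`, the choice to fold leftover records into the last group, and the use of the within-group sum of squared errors as the measure of variability loss are all assumptions made for illustration.

```python
from statistics import mean


def fixed_size_microaggregate(values, k):
    """Rank values, cut the ranking into consecutive groups of at least k
    records, and replace each value by the mean of its group.

    Hypothetical helper for illustration; returns (masked_values, sse),
    where sse is the within-group sum of squared errors.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(values)
    if n < k:
        raise ValueError("need at least k records to form one group")

    # Indices of the records in ascending order of value (the ranking step).
    order = sorted(range(n), key=lambda i: values[i])

    n_groups = n // k
    masked = [0.0] * n
    sse = 0.0
    for g in range(n_groups):
        start = g * k
        # Assumption: the last group absorbs the n % k leftover records,
        # so every group holds between k and 2k - 1 records.
        end = n if g == n_groups - 1 else start + k
        members = order[start:end]
        centroid = mean(values[i] for i in members)
        for i in members:
            masked[i] = centroid
            sse += (values[i] - centroid) ** 2
    return masked, sse


if __name__ == "__main__":
    data = [12.0, 1.0, 10.0, 3.0, 2.0, 11.0, 13.0]
    masked, sse = fixed_size_microaggregate(data, k=3)
    print(masked)
    print(sse)
```

Because group boundaries here depend only on rank position, a natural cluster can be split across two groups; the data-oriented methods mentioned in the abstract are aimed at avoiding that kind of split.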