• DocumentCode
    124267
  • Title
    Protecting Data Privacy from Being Inferred from High Dimensional Correlated Data
  • Author
    Huafeng Ba ; Xiaoming Gao ; Xiaofeng Zhang ; Zhenyu He
  • Author_Institution
    Coll. of Comput. Sci., Harbin Inst. of Technol., Xili, China
  • Volume
    2
  • fYear
    2014
  • fDate
    11-14 Aug. 2014
  • Firstpage
    495
  • Lastpage
    502
  • Abstract
    In the era of big data, privacy has become a challenging issue that has already attracted considerable research effort. In the literature, most existing privacy-preserving algorithms protect users' privacy from disclosure by making a designated set of semi-id features indistinguishable. However, how to automatically determine the appropriate semi-id features from high-dimensional correlated data is seldom studied. Therefore, in this paper we first study the problem theoretically and propose the IPFS algorithm to find all possible features forming the candidate semi-id feature set from which users' privacy can be inferred. Then, the KIPFS algorithm is proposed to find the key features in the candidate semi-id feature set. By anonymizing this key feature set, called the key inferring privacy features (KIPFS), users' privacy is protected. To evaluate the effectiveness and efficacy of the proposed approach, two state-of-the-art algorithms, i.e., K-anonymity and t-closeness, applied to the designated semi-id feature set are chosen as the baseline algorithms, and their revised versions are applied to the KIPFS for performance comparison. The results show that, by anonymizing the identified KIPFS, both aforementioned algorithms can achieve better performance than the original ones in terms of efficiency and data quality.
  • Keywords
    Big Data; data protection; feature selection; KIPFS algorithm; big data; data privacy protection; high-dimensional correlated data; k-anonymity; key inferring privacy features; privacy preserving algorithms; semi-id feature set; t-closeness; Algorithm design and analysis; Computer science; Data analysis; Data privacy; Educational institutions; Equations; Privacy; algorithm; data publishing; privacy preserving data mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Warsaw
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2014.139
  • Filename
    6927666