• DocumentCode
    3591246
  • Title

    A Cost-Efficient and Versatile Sanitizing Algorithm by Using a Greedy Approach

  • Author

    Wu, Chieh-Ming ; Huang, Yin-Fu

  • Author_Institution
    Grad. Sch. of Eng. Sci. & Technol., Nat. Yunlin Univ. of Sci. & Technol., Touliu, Taiwan
  • Volume
    2
  • fYear
    2009
  • Firstpage
    23
  • Lastpage
    27
  • Abstract
    In a very large database, there exists sensitive information that must be protected against unauthorized access. Protecting the confidentiality of this information has been a long-term goal of the database security research community and of government statistical agencies. In this paper, we proposed greedy methods for hiding sensitive rules. The experimental results showed the effectiveness of our approaches in terms of the undesired side effects avoided in the rule hiding process. The results also revealed that, in most cases, all the sensitive rules were hidden without generating spurious rules. Good scalability of our approach with respect to database size was achieved by using an efficient data structure, FCET, to store only the maximal frequent itemsets instead of all frequent itemsets. Furthermore, we proposed a new framework for enforcing privacy in mining association rules. In the framework, we combined the techniques for efficiently hiding sensitive rules with a transaction retrieval engine based on the FCET index tree. For hiding sensitive rules, the proposed greedy approach includes a greedy approximation algorithm and a greedy exhausted algorithm to sanitize the database. In particular, we presented four strategies in the sanitizing procedure and four strategies in the exposed procedure for hiding a group of association rules characterized as sensitive or artificial rules. In addition, the exposed procedure exposes missing rules during processing so that the number of missing rules is kept as low as possible.
  • Keywords
    data mining; greedy algorithms; security of data; very large databases; FCET index tree; database security research community; efficient data structure; government statistical agencies; greedy approach; greedy approximation algorithm; information protection; maximal frequent itemsets; mining association rules; rule hiding process; sanitizing algorithm; transaction retrieval engine; very large database; Association rules; Data security; Data structures; Databases; Government; Information security; Itemsets; Privacy; Protection; Scalability; FCET; greedy method
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
  • Print_ISBN
    978-0-7695-3735-1
  • Type

    conf

  • DOI
    10.1109/FSKD.2009.914
  • Filename
    5358937
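
The abstract describes hiding sensitive rules by greedily sanitizing a transaction database. The record gives no details of the authors' algorithms, the FCET index tree, or the four sanitizing and four exposing strategies, so the block below is only a minimal sketch of the general idea under stated assumptions: a sensitive rule is represented by its itemset, it counts as hidden once the itemset's support drops below a threshold, and the greedy step deletes the single item that breaks the most still-visible sensitive itemsets. The names `support`, `greedy_sanitize`, and `min_support` are hypothetical and do not come from the paper.

```python
def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def greedy_sanitize(db, sensitive_itemsets, min_support):
    """Return a sanitized copy of `db` in which no sensitive itemset is frequent.

    Assumption (not from the paper): a rule is hidden by lowering the support
    of its itemset below `min_support`, one item deletion at a time.
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    sensitive = [frozenset(s) for s in sensitive_itemsets]
    if any(not s for s in sensitive):
        raise ValueError("sensitive itemsets must be non-empty")

    db = [set(t) for t in db]  # work on a copy, leave the input untouched
    while True:
        # Sensitive itemsets that are still frequent, hence still visible.
        live = [s for s in sensitive if support(db, s) >= min_support]
        if not live:
            return db

        # Greedy choice: the (transaction, item) deletion that breaks the
        # largest number of live sensitive itemsets in that transaction.
        best, best_gain = None, 0
        for i, t in enumerate(db):
            covered = [s for s in live if s <= t]
            if not covered:
                continue
            for item in sorted(set().union(*covered)):
                gain = sum(1 for s in covered if item in s)
                if gain > best_gain:
                    best, best_gain = (i, item), gain

        i, item = best
        db[i].discard(item)


if __name__ == "__main__":
    transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    cleaned = greedy_sanitize(transactions, [{"a", "b"}, {"b", "c"}], min_support=2)
    print(cleaned)
```

Each deletion lowers the support of at least one live itemset, so the loop terminates. The sketch ignores side effects such as missing or spurious rules, which the abstract says the paper's exposed procedure is designed to reduce.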