DocumentCode :
1967534
Title :
Using unknowns for hiding sensitive predictive association rules
Author :
Wang, Shyue-Liang ; Jafari, Ayat
Author_Institution :
Dept. of Comput. Sci., New York Inst. of Technol., NY, USA
fYear :
2005
fDate :
15-17 Aug. 2005
Firstpage :
223
Lastpage :
228
Abstract :
Privacy-preserving data mining is a novel research direction in data mining and statistical databases, in which data mining algorithms are analyzed for the side effects they incur on data privacy. Two types of privacy have been proposed for data mining. The first, called output privacy, means that the data is altered so that the mining result preserves certain privacy. The second, called input privacy, means that the data is manipulated so that the mining result is not affected or only minimally affected. In output privacy, given specific rules to be hidden, many data-altering techniques for hiding association, classification and clustering rules have been proposed. However, to specify the rules to be hidden, the entire data mining process needs to be executed first. For some applications, we are only interested in hiding certain sensitive predictive rules that contain given items. A predictive association rule set is the smallest rule set that makes the same predictions as the whole association rule set under confidence priority. In this work, we assume that only the sensitive items are given and propose two algorithms, ISL (increase support of LHS) and DSR (decrease support of RHS), which replace data with unknowns in the database so that sensitive predictive rules containing the specified items on the left-hand side of the rule cannot be inferred through association rule mining. Examples illustrating the proposed algorithms are given, and the characteristics of the algorithms are analyzed. The efficiency of the proposed approach is further compared with the approach of Saygin et al. It is observed that our approach requires fewer database scans and prunes more hidden rules. However, our approach must hide all rules containing the hidden items on the left-hand side, whereas the approach of Saygin et al. can hide any specific rule.
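The abstract does not give the steps of ISL or DSR, so the following is only a minimal sketch of the underlying idea, not the authors' algorithms. It assumes a hypothetical `UNKNOWN` marker, a hypothetical dictionary layout for transactions, and a conservative lower bound on confidence (minimum support of LHS and RHS together, divided by maximum support of LHS). Under those assumptions it shows how replacing a value with an unknown can lower the confidence that a miner can guarantee for a rule.

```python
from typing import Dict, Iterable, List, Tuple, Union

UNKNOWN = "?"  # hypothetical marker for a value replaced by an unknown
Value = Union[int, str]
Transaction = Dict[str, Value]  # item -> 1 (present), 0 (absent), or UNKNOWN


def support_bounds(db: List[Transaction], items: Iterable[str]) -> Tuple[float, float]:
    """Return (min_support, max_support) of an itemset.

    Assumption: min support counts transactions where every item is known
    to be present; max support also counts transactions where some items
    are unknown but none is known to be absent.
    """
    items = list(items)
    lo = sum(all(t.get(i, 0) == 1 for i in items) for t in db)
    hi = sum(all(t.get(i, 0) in (1, UNKNOWN) for i in items) for t in db)
    return lo / len(db), hi / len(db)


def min_confidence(db: List[Transaction], lhs: List[str], rhs: List[str]) -> float:
    """Conservative lower bound on the confidence of lhs -> rhs."""
    lo_rule, _ = support_bounds(db, lhs + rhs)
    _, hi_lhs = support_bounds(db, lhs)
    return lo_rule / hi_lhs if hi_lhs else 0.0


db: List[Transaction] = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 1},
    {"A": 1, "B": 0},
    {"A": 0, "B": 1},
]

before = min_confidence(db, ["A"], ["B"])

# DSR-style step (illustrative): hide the RHS item in a supporting transaction.
db[0]["B"] = UNKNOWN
after_rhs_hidden = min_confidence(db, ["A"], ["B"])

# ISL-style step (illustrative): make the LHS item possibly present in a
# transaction that did not contain it, raising the maximum LHS support.
db[3]["A"] = UNKNOWN
after_lhs_raised = min_confidence(db, ["A"], ["B"])

print(before, after_rhs_hidden, after_lhs_raised)
```

In this toy database each replacement lowers the guaranteed confidence of the rule A -> B, which is the effect a rule-hiding method aims for when the bound falls below the mining threshold.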
Keywords :
data encapsulation; data mining; data privacy; statistical databases; classification rule; clustering rule; data altering technique; data mining; data privacy; databases scanning; sensitive predictive association rules hiding; statistical database; Algorithm design and analysis; Association rules; Computer science; Cryptography; Data mining; Data privacy; Merging; Sampling methods; Transaction databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
IEEE International Conference on Information Reuse and Integration, 2005 (IRI-2005)
Print_ISBN :
0-7803-9093-8
Type :
conf
DOI :
10.1109/IRI-05.2005.1506477
Filename :
1506477