DocumentCode
3164174
Title
Permutation-Based Sequential Pattern Hiding
Author
Gwadera, Robert; Gkoulalas-Divanis, A.; Loukides, G.
Author_Institution
EPFL, Lausanne, Switzerland
fYear
2013
fDate
7-10 Dec. 2013
Firstpage
241
Lastpage
250
Abstract
Sequence data are increasingly shared to enable mining applications in various domains, such as marketing, telecommunications, and healthcare. This, however, may expose sensitive sequential patterns, which lead to intrusive inferences about individuals or leak confidential information about organizations. This paper presents the first permutation-based approach to prevent this threat. Our approach hides sensitive patterns by replacing them with carefully selected permutations that avoid changes in the set of frequent nonsensitive patterns (side-effects) and in the ordering information of sequences (distortion). By doing so, it retains data utility in sequence mining and in tasks based on item set properties, because permutation preserves the support of items, unlike deletion, which is used in existing works. To realize our approach, we develop an efficient and effective algorithm for generating permutations with minimal side-effects and distortion. This algorithm also avoids implausible symbol orderings that may exist in certain applications. In addition, we propose a method to hide sensitive patterns from a sequence dataset. Extensive experiments verify that our method allows significantly more accurate data analysis than the state-of-the-art approach.
Keywords
data encapsulation; data mining; distortion; data analysis; data utility; implausible symbol orderings; intrusive inferences; item set property; minimal side-effects; permutation-based sequential pattern hiding; sensitive sequential patterns; sequence data; sequence dataset; sequence mining; Algorithm design and analysis; Companies; Insurance; Itemsets; Pattern matching; data privacy; permutation; sequential pattern hiding
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location
Dallas, TX
ISSN
1550-4786
Type
conf
DOI
10.1109/ICDM.2013.57
Filename
6729508
Link To Document