DocumentCode :
710098
Title :
Privacy-aware dynamic feature selection
Author :
Pattuk, Erman ; Kantarcioglu, Murat ; Ulusoy, Huseyin ; Malin, Bradley
Author_Institution :
Univ. of Texas at Dallas, Richardson, TX, USA
fYear :
2015
fDate :
13-17 April 2015
Firstpage :
78
Lastpage :
88
Abstract :
Big data will enable the development of novel services that enhance a company's market advantage, competition, or productivity. At the same time, the utilization of such a service could disclose sensitive data in the process, which raises significant privacy concerns. To protect individuals, various policies, such as the Code of Fair Information Practices, as well as recent laws require organizations to capture only the minimal amount of data necessary to support a service. While this is a notable goal, choosing the minimal data is a non-trivial process, especially while considering privacy and utility constraints. In this paper, we introduce a technique to minimize sensitive data disclosure by focusing on privacy-aware feature selection. During model deployment, the service provider requests only a subset of the available features from the client, such that it can produce results with maximal confidence, while minimizing its ability to violate a client's privacy. We propose an iterative approach, where the server requests information one feature at a time until the client-specified privacy budget is exhausted. The overall process is dynamic, such that the feature selected at each step depends on the previously selected features and their corresponding values. We demonstrate our technique with three popular classification algorithms and perform an empirical analysis over three real-world datasets to illustrate that, in almost all cases, classifiers that select features using our strategy have the same error-rate as state-of-the-art static feature selection methods that fail to preserve privacy.
Keywords :
Big Data; data privacy; feature selection; pattern classification; classifier; empirical analysis; error-rate; iterative approach; privacy constraints; privacy-aware dynamic feature selection; sensitive data disclosure minimization; utility constraints; Decision trees; Measurement; Niobium; Privacy; Probability; Servers
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
Type :
conf
DOI :
10.1109/ICDE.2015.7113274
Filename :
7113274