DocumentCode
140850
Title
Generating private synthetic databases for untrusted system evaluation
Author
Lu, Wentian ; Miklau, Gerome ; Gupta, V.
Author_Institution
Sch. of Comput. Sci., Univ. of Massachusetts Amherst, Amherst, MA, USA
fYear
2014
fDate
March 31 2014-April 4 2014
Firstpage
652
Lastpage
663
Abstract
Evaluating the performance of database systems is crucial when database vendors or researchers are developing new technologies. But such evaluation tasks rely heavily on actual data and query workloads that are often unavailable to researchers due to privacy restrictions. To overcome this barrier, we propose a framework for the release of a synthetic database which accurately models selected performance properties of the original database. We improve on prior work on synthetic database generation by providing a formal, rigorous guarantee of privacy. Accuracy is achieved by generating synthetic data using a carefully selected set of statistical properties of the original data which balance privacy loss with relevance to the given query workload. An important contribution of our framework is an extension of standard differential privacy to multiple tables.
Keywords
data privacy; database management systems; statistical analysis; trusted computing; balance privacy loss; database researchers; database vendors; differential privacy; privacy guarantee; privacy restrictions; private synthetic database generation; query workloads; statistical properties; synthetic data generation; untrusted system evaluation; Aggregates; Data privacy; Databases; Noise; Privacy; Sensitivity; Standards
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2014 IEEE 30th International Conference on
Conference_Location
Chicago, IL
Type
conf
DOI
10.1109/ICDE.2014.6816689
Filename
6816689