DocumentCode :
146747
Title :
Sanitizing and Minimizing Databases for Software Application Test Outsourcing
Author :
Li, Boyang ; Grechanik, Mark ; Poshyvanyk, Denys
Author_Institution :
Coll. of William & Mary, Williamsburg, VA, USA
fYear :
2014
fDate :
March 31 2014-April 4 2014
Firstpage :
233
Lastpage :
242
Abstract :
Testing software applications that use nontrivial databases is increasingly outsourced to test centers in order to achieve lower cost and higher quality. Not only do different data privacy laws prevent organizations from sharing this data with test centers because databases contain sensitive information, but also this situation is aggravated by big data - it is time consuming and difficult to anonymize, distribute, and test with large databases. Deleting data randomly often leads to significantly worsened test coverages and fewer uncovered faults, thereby reducing the quality of software applications. We propose a novel approach for Protecting and mInimizing databases for Software TestIng taSks (PISTIS) that both sanitizes and minimizes a database that comes along with an application. PISTIS uses a weight-based data clustering algorithm that partitions data in the database using information obtained using program analysis that describes how this data is used by the application. For each cluster, a centroid object is computed that represents different persons or entities in the cluster, and we use associative rule mining to compute and use constraints to ensure that the centroid objects are representative of the general population of the data in the cluster. Doing so also sanitizes information, since these centroid objects replace the original data to make it difficult for attackers to infer sensitive information. Thus, we reduce a large database to a few centroid objects and we show in our experiments with two applications that test coverage stays within a close range to its original level.
Keywords :
Big Data; data mining; database management systems; outsourcing; pattern clustering; program diagnostics; program testing; PISTIS; associative rule mining; big data; centroid object; database minimization; database sanitization; nontrivial databases; program analysis; protecting and minimizing databases for software testing tasks; software application test outsourcing; weight-based data clustering algorithm; Chemotherapy; Data privacy; Databases; Organizations; Software; Software testing; anonymity; clustering; data compression; privacy; program analysis; software testing; test coverage;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Testing, Verification and Validation (ICST), 2014 IEEE Seventh International Conference on
Conference_Location :
Cleveland, OH
Type :
conf
DOI :
10.1109/ICST.2014.36
Filename :
6823885