Title :
A Formal Usability Constraints Model for Watermarking of Outsourced Datasets
Author :
Kamran, Muhammad ; Farooq, M.
Author_Institution :
Department of Computer Science, National University of Computer and Emerging Sciences (NUCES), Islamabad, Pakistan
Abstract :
Large datasets are being mined to extract hidden knowledge and patterns that help decision makers make effective, efficient, and timely decisions in an increasingly competitive world. This type of “knowledge-driven” data mining is not possible without sharing the datasets between their owners and data mining experts (or corporations); as a consequence, protecting ownership of the datasets (by embedding a watermark) is becoming relevant. The most important challenge in watermarking datasets that are to be mined is how to preserve the knowledge in their features or attributes. Usually, an owner needs to manually define “usability constraints” for each type of dataset to preserve the contained knowledge. The major contribution of this paper is a novel formal model that enables a data owner to define usability constraints, which preserve the knowledge contained in the dataset, in an automated fashion. The model aims at preserving the “classification potential” of each feature and other major characteristics of datasets that play an important role during the mining process; as a result, learning statistics and decision-making rules also remain intact. We have implemented our model and integrated it with a new watermark embedding algorithm to show that the inserted watermark not only preserves the knowledge contained in a dataset but also significantly enhances watermark security compared with existing techniques. We have tested our model on 25 different data-mining datasets to show its effectiveness and its ability to adapt and generalize.
Keywords :
data mining; formal specification; formal verification; learning (artificial intelligence); pattern classification; watermarking; classification potential; decision-making rule; formal usability constraints model; knowledge extraction; knowledge preservation; knowledge-driven data mining; learning statistics; outsourced dataset watermarking; pattern extraction; watermark embedding algorithm; watermark security; Data mining; Data models; Databases; Mutual information; Numerical models; Usability; Watermarking; Data usability; knowledge-preserving; ownership-preserving data mining; right protection; watermarking datasets;
Journal_Title :
Information Forensics and Security, IEEE Transactions on
DOI :
10.1109/TIFS.2013.2259234