Title :
Probability Distribution Reconstruction for Nominal Attributes in Privacy Preserving Classification
Author :
Andruszkiewicz, Piotr
Author_Institution :
Inst. of Comput. Sci., Warsaw Univ. of Technol., Warsaw
Abstract :
Concerns about the privacy of data used in data mining have emerged recently. Users fear misuse of both the data and the knowledge discovered from it. Several methods of privacy preserving classification have therefore been proposed in the literature. One of these methods enables miners to use continuous and nominal attributes simultaneously in classification. Reconstruction of the probability distribution is an important task in privacy preserving classification for both nominal and continuous attributes that were distorted with a randomization-based technique and are stored in a centralized database. We present a new algorithm, EQ, for reconstructing the probability distribution of nominal attributes; it outperforms the former algorithm, especially at high privacy levels. The effectiveness of the new solution, measured by information loss in the reconstruction of the probability distribution of nominal attributes and by classification accuracy, has been tested and is presented in this paper.
Keywords :
data mining; data privacy; pattern classification; statistical distributions; centralized database; continuous attribute; knowledge discovery; nominal attribute; privacy preserving classification; probability distribution reconstruction; randomization-based technique; Association rules; Computer science; Cryptography; Decision trees; Distributed databases; Information technology; Probability distribution; Testing
Conference_Title :
Convergence and Hybrid Information Technology, 2008. ICHIT '08. International Conference on
Conference_Location :
Daejeon
Print_ISBN :
978-0-7695-3328-5
DOI :
10.1109/ICHIT.2008.253