Title :
PerturBoost: Practical Confidential Classifier Learning in the Cloud
Author :
Keke Chen ; Shumin Guo
Author_Institution :
Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
Abstract :
Mining large datasets requires intensive computing resources and data mining expertise, which may not be available to many users. With the development of cloud computing and services computing, data mining tasks can now be moved to the cloud or outsourced to third parties to save costs. In this new paradigm, data and model confidentiality become the major concern for the data owner. Users are also concerned about the potential tradeoff among costs, model quality, and confidentiality. In this paper, we propose the PerturBoost framework to address these problems in confidential cloud or outsourced learning. Combined with the random space perturbation (RASP) method, which we also developed, PerturBoost can effectively protect data and model confidentiality while preserving model quality, at low client-side cost. Based on the boosting framework, we develop a number of base learner algorithms that can learn linear classifiers from RASP-perturbed data. The approach has been evaluated on public datasets. The results show that RASP-based PerturBoost achieves model accuracy very close to that of classifiers trained on the original data with the AdaBoost method, with a high confidentiality guarantee and acceptable costs.
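The general idea of boosting linear base learners over perturbed data can be illustrated with a minimal sketch. It is not the paper's algorithm: the perturbation here is a plain invertible linear map used as a stand-in for RASP (whose details are not given in this record), the random-direction base learner is an assumption, and all names (`perturb`, `best_random_linear`, `adaboost`, the key matrix `A`) are hypothetical. The sketch relies only on the fact that a hyperplane w·x >= t in the original space corresponds to the hyperplane (A^-T w)·(Ax) >= t in the mapped space, so linearly separable data stays linearly separable.

```python
import math
import random


def perturb(points, matrix):
    """Apply a secret invertible linear map to every row (a stand-in, not RASP)."""
    return [[sum(m * v for m, v in zip(row, p)) for row in matrix] for p in points]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def vote(direction, threshold, sign, x):
    """Linear base classifier: +sign on one side of the hyperplane, -sign on the other."""
    return sign if dot(direction, x) >= threshold else -sign


def best_random_linear(X, y, w, rng, tries=50):
    """Assumed base learner: try random directions, keep the lowest weighted error."""
    best = None
    for _ in range(tries):
        d = [rng.gauss(0, 1) for _ in X[0]]
        for t in [dot(d, x) for x in X]:
            for sign in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if vote(d, t, sign, x) != yi)
                if best is None or err < best[0]:
                    best = (err, d, t, sign)
    return best


def adaboost(X, y, rounds=10, seed=0):
    """Standard discrete AdaBoost over the linear base learners above."""
    rng = random.Random(seed)
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, d, t, sign = best_random_linear(X, y, w, rng)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, d, t, sign))
        if err <= 1e-10:  # a perfect base learner: nothing left to reweight
            break
        # Raise the weight of misclassified rows, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * vote(d, t, sign, x))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all base classifiers."""
    score = sum(alpha * vote(d, t, sign, x) for alpha, d, t, sign in model)
    return 1 if score >= 0 else -1


# Toy, linearly separable data with a margin (illustrative only).
data_rng = random.Random(1)
X, y = [], []
while len(X) < 40:
    p = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
    s = p[0] + 0.5 * p[1]
    if abs(s) > 0.2:
        X.append(p)
        y.append(1 if s > 0 else -1)

A = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical secret key, determinant 5
X_perturbed = perturb(X, A)   # the learner only ever sees these rows
model = adaboost(X_perturbed, y)
accuracy = sum(predict(model, x) == yi for x, yi in zip(X_perturbed, y)) / len(y)
print(f"training accuracy on perturbed data: {accuracy:.2f}")
```

In this sketch the learner never touches the original rows `X`, only `X_perturbed`. A bare linear map offers far weaker protection than the perturbation the abstract describes, so the example should be read as an illustration of why linear classifiers survive such a transformation, not as a confidentiality mechanism.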
Keywords :
cloud computing; costing; data mining; learning (artificial intelligence); pattern classification; security of data; AdaBoost method; PerturBoost framework; RASP method; RASP-perturbed data; base learner algorithm; boosting framework; cloud computing; confidential cloud; cost; data confidentiality; data mining; linear classifier learning; model confidentiality; model quality; outsourced learning; practical confidential classifier learning; random space perturbation method; services computing; Accuracy; Cloud computing; Computational modeling; Data mining; Data models; Servers; Vectors;
Conference_Title :
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
DOI :
10.1109/ICDM.2013.118