DocumentCode :
1844916
Title :
Applying sensitivity analysis to missing data in classifiers
Author :
Lei, Lei ; Wu, Naijun ; Liu, Peng
Author_Institution :
Sch. of Inf. Manage. & Eng., Shanghai Univ. of Finance & Econ., China
Volume :
2
fYear :
2005
fDate :
13-15 June 2005
Firstpage :
1051
Abstract :
Among data mining technologies, predictive classification has a wide range of applications. Practitioners build classification models to make predictions and hope to achieve high classification accuracy. However, datasets often contain data quality problems that affect the accuracy of classification models; missing data is a common example. In this paper, we investigate the influence of missing data on classifiers. First, basic concepts of data quality and sensitivity analysis are introduced briefly. Then, the sensitivity of six representative classifiers to missing data is studied through sensitivity experiments. The results indicate that when the proportion of missing data in a dataset exceeds 20%, the missing data has a large adverse impact on the classification accuracy of the model. Moreover, missing data affects different datasets differently, depending on their characteristics. Among the six classifiers, the naive Bayesian classifier is the least sensitive to missing data.
Keywords :
backpropagation; belief networks; data mining; decision trees; sensitivity analysis; classification accuracy; classification models; data quality problems; missing data; naive Bayesian classifier; predictive classification; Classification algorithms; Data engineering; Data warehouses; Databases; Delta modulation; Economic forecasting; Finance; Information management
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Services Systems and Services Management, 2005. Proceedings of ICSSSM '05. 2005 International Conference on
Print_ISBN :
0-7803-8971-9
Type :
conf
DOI :
10.1109/ICSSSM.2005.1500155
Filename :
1500155