DocumentCode :
3726481
Title :
Classification Uncertainty of Multiple Imputed Data
Author :
Tuomo Alasalmi; Koskimäki; Jaakko Suutala; Röning
Author_Institution :
Data Anal. &
fYear :
2015
Firstpage :
151
Lastpage :
158
Abstract :
Every classification model contains uncertainty. This uncertainty can be distributed evenly or concentrated in certain areas of the feature space. In regular classification tasks, the uncertainty can be estimated from posterior probabilities. On the other hand, if the data set contains missing values, not all classifiers can be used directly. Imputing missing values solves this problem, but it suppresses variation in the data, leading to underestimation of uncertainty, and can also bias the results. Multiple imputation, where several copies of the data set are created, solves these problems, but the classical approach to uncertainty estimation does not generalize to this case. Thus, in this paper, we propose a novel algorithm to estimate classification uncertainty with multiple imputed data. We show that the algorithm performs as well as the benchmark algorithm with a classifier that supports classification with missing values. It also supports the use of any classifier, even if it does not support classification with missing values, as long as it supports the estimation of posterior probabilities.
Keywords :
"Uncertainty","Data models","Support vector machines","Machine learning algorithms","Data handling","Correlation","Analytical models"
Publisher :
IEEE
Conference_Title :
Computational Intelligence, 2015 IEEE Symposium Series on
Print_ISBN :
978-1-4799-7560-0
Type :
conf
DOI :
10.1109/SSCI.2015.32
Filename :
7376605