• DocumentCode
    3726481
  • Title

    Classification Uncertainty of Multiple Imputed Data

  • Author

    Tuomo Alasalmi; Koskimäki; Jaakko Suutala; Röning

  • Author_Institution
    Data Anal. &
  • fYear
    2015
  • Firstpage
    151
  • Lastpage
    158
  • Abstract
    Every classification model contains uncertainty. This uncertainty can be distributed evenly or concentrated in certain areas of the feature space. In regular classification tasks, the uncertainty can be estimated from posterior probabilities. However, if the data set contains missing values, not all classifiers can be used directly. Imputing the missing values solves this problem, but it suppresses variation in the data, which leads to underestimation of uncertainty and can also bias the results. Multiple imputation, where several copies of the data set are created, solves these problems, but the classical approach to uncertainty estimation does not generalize to this case. Thus, in this paper, we propose a novel algorithm to estimate classification uncertainty with multiple imputed data. We show that the algorithm performs as well as the benchmark algorithm with a classifier that supports classification with missing values. It also supports the use of any classifier, even one that does not support classification with missing values, as long as it supports the estimation of posterior probabilities.
  • Keywords
    "Uncertainty","Data models","Support vector machines","Machine learning algorithms","Data handling","Correlation","Analytical models"
  • Publisher
    ieee
  • Conference_Titel
    Computational Intelligence, 2015 IEEE Symposium Series on
  • Print_ISBN
    978-1-4799-7560-0
  • Type

    conf

  • DOI
    10.1109/SSCI.2015.32
  • Filename
    7376605