• DocumentCode
    553086
  • Title
    Imputation of missing data using ensemble algorithms
  • Author
    Xiaoling Lu ; Jiesheng Si ; Lanfeng Pan ; Yanyun Zhao
  • Author_Institution
    Center for Appl. Stat., Renmin Univ. of China, Beijing, China
  • Volume
    2
  • fYear
    2011
  • fDate
    26-28 July 2011
  • Firstpage
    1312
  • Lastpage
    1315
  • Abstract
    Missing or incomplete data are common in statistical practice. One way to handle missing data is model-based imputation, performed either once (single imputation) or several times (multiple imputation). A key problem in analyzing an imputed dataset is valid statistical inference for the estimated parameters, that is, correctly estimating the standard error of the statistic of interest. This paper proposes using recently developed ensemble algorithms as the imputation model. To achieve multiple imputation, we suggest bootstrap sampling the prediction error several times. The properties of the proposed methods are studied by simulation and compared with those of existing methods. Finally, the methods are applied to a large real dataset, taking the missing-data mechanism into account.
  • Keywords
    data analysis; learning (artificial intelligence); parameter estimation; sampling methods; bootstrap sampling; ensemble algorithms; incomplete data; missing data imputation; multiple imputation; parameter estimation; statistical reference; statistical situation; supervised learning; Bagging; Boosting; Data models; Educational institutions; Estimation; Prediction algorithms; Predictive models; ensemble algorithm; imputation; missing data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-61284-180-9
  • Type
    conf
  • DOI
    10.1109/FSKD.2011.6019647
  • Filename
    6019647
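
The abstract describes multiple imputation with an ensemble model, where each imputed dataset is obtained by bootstrap sampling the prediction error. The following is a minimal Python sketch of that idea, not the authors' implementation. Its assumptions: a single predictor `x`, a response `y` with missing entries marked `None`, a bagged ensemble of least-squares lines standing in for the ensemble algorithm, and Rubin's rules for pooling (the record does not say how the paper combines estimates). All function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to a constant prediction
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_bagged(xs, ys, n_models, rng):
    """Bagging (assumed ensemble): one line per bootstrap resample."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(models, x):
    """Ensemble prediction: average of the member predictions."""
    return statistics.fmean(a + b * x for a, b in models)


def multiple_impute(xs, ys, m=5, n_models=25, seed=0):
    """Return m completed copies of ys; None marks a missing value."""
    rng = random.Random(seed)
    obs = [i for i, y in enumerate(ys) if y is not None]
    models = fit_bagged([xs[i] for i in obs], [ys[i] for i in obs],
                        n_models, rng)
    # Prediction errors on the observed cases.
    residuals = [ys[i] - predict(models, xs[i]) for i in obs]
    completed = []
    for _ in range(m):
        # Imputed value = ensemble prediction + one resampled prediction error.
        completed.append([
            y if y is not None else predict(models, x) + rng.choice(residuals)
            for x, y in zip(xs, ys)
        ])
    return completed


def pool_mean(completed):
    """Pool the sample mean over imputations with Rubin's rules (assumed)."""
    m, n = len(completed), len(completed[0])
    estimates = [statistics.fmean(d) for d in completed]
    within = statistics.fmean(statistics.variance(d) / n for d in completed)
    between = statistics.variance(estimates) if m > 1 else 0.0
    total = within + (1 + 1 / m) * between
    return statistics.fmean(estimates), total ** 0.5
```

Calling `multiple_impute(xs, ys)` and passing the result to `pool_mean` gives a pooled estimate of the mean of `y` and a standard error that includes the between-imputation variability, which is the quantity the abstract identifies as the key problem.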