• DocumentCode
    1519612
  • Title
    Missing Value Estimation for Mixed-Attribute Data Sets
  • Author
    Zhu, Xiaofeng ; Zhang, Shichao ; Jin, Zhi ; Zhang, Zili ; Xu, Zhuoming
  • Author_Institution
    Sch. of Inf. Technol. & Electr. Eng., Univ. of Queensland, Brisbane, QLD, Australia
  • Volume
    23
  • Issue
    1
  • fYear
    2011
  • Firstpage
    110
  • Lastpage
    121
  • Abstract
    Missing data imputation is a key issue in learning from incomplete data. Various techniques have been developed with great success for handling missing values in data sets with homogeneous attributes (their independent attributes are all either continuous or discrete). This paper studies a new setting of missing data imputation: imputing missing data in data sets with heterogeneous attributes (their independent attributes are of different types), referred to as imputing mixed-attribute data sets. Although many real applications fall into this setting, no estimator has been designed for imputing mixed-attribute data sets. This paper first proposes two consistent estimators, for discrete and continuous missing target values, respectively. It then advocates a mixture-kernel-based iterative estimator to impute mixed-attribute data sets. The proposed method is evaluated in extensive experiments against some typical algorithms, and the results demonstrate that the proposed approach outperforms these existing imputation methods in terms of classification accuracy and root mean square error (RMSE) at different missing ratios.
  • Keywords
    data mining; estimation theory; iterative methods; learning (artificial intelligence); mean square error methods; missing data imputation; missing value estimation; mixed attribute data set; mixture kernel based iterative estimator; root mean square error; Bibliographies; Databases; Information science; Iterative algorithms; Kernel; Machine learning; Machine learning algorithms; Root mean square; Classification; methodologies
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2010.99
  • Filename
    5487520