• DocumentCode
    996738
  • Title

    On Classification with Incomplete Data

  • Author

    Williams, David ; Liao, Xuejun ; Xue, Ya ; Carin, Lawrence ; Krishnapuram, Balaji

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC
  • Volume
    29
  • Issue
    3
  • fYear
    2007
  • fDate
    3/1/2007 12:00:00 AM
  • Firstpage
    427
  • Lastpage
    436
  • Abstract
    We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both expectation-maximization (EM) and variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data: both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.
  • Keywords
    Bayes methods; Gaussian processes; expectation-maximisation algorithm; feature extraction; pattern classification; regression analysis; Gaussian mixture model; data classification; expectation-maximization; feature vectors; incomplete-data problem; parameter estimation; supervised logistic regression algorithm; variational Bayesian EM; Acoustic sensors; Bayesian methods; Classification algorithms; Density functional theory; Infrared sensors; Logistics; Parameter estimation; Performance analysis; Remote sensing; Supervised learning; Classification; imperfect labeling; incomplete data; missing data; semisupervised learning; supervised learning; Algorithms; Artificial Intelligence; Computer Simulation; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Logistic Models; Pattern Recognition, Automated; Reproducibility of Results; Sample Size; Sensitivity and Specificity
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2007.52
  • Filename
    4069259