• DocumentCode
    1350851
  • Title

    Accuracy of Pseudo-Inverse Covariance Learning—A Random Matrix Theory Analysis

  • Author

    Hoyle, David C.

  • Author_Institution
    Fac. of Life Sci., Univ. of Manchester, Manchester, UK
  • Volume
    33
  • Issue
    7
  • fYear
    2011
  • fDate
    7/1/2011
  • Firstpage
    1470
  • Lastpage
    1481
  • Abstract
    For many learning problems, estimates of the inverse population covariance are required and often obtained by inverting the sample covariance matrix. Increasingly for modern scientific data sets, the number of sample points is less than the number of features and so the sample covariance is not invertible. In such circumstances, the Moore-Penrose pseudo-inverse sample covariance matrix, constructed from the eigenvectors corresponding to nonzero sample covariance eigenvalues, is often used as an approximation to the inverse population covariance matrix. The reconstruction error of the pseudo-inverse sample covariance matrix in estimating the true inverse covariance can be quantified via the Frobenius norm of the difference between the two. The reconstruction error is dominated by the smallest nonzero sample covariance eigenvalues and diverges as the sample size becomes comparable to the number of features. For high-dimensional data, we use random matrix theory techniques and results to study the reconstruction error for a wide class of population covariance matrices. We also show how bagging and random subspace methods can result in a reduction in the reconstruction error and can be combined to improve the accuracy of classifiers that utilize the pseudo-inverse sample covariance matrix. We test our analysis on both simulated and benchmark data sets.
  • Keywords
    covariance matrices; eigenvalues and eigenfunctions; learning (artificial intelligence); Frobenius norm; covariance eigenvalues; inverse population covariance; pseudo-inverse covariance learning; random matrix theory analysis; reconstruction error; Accuracy; Bagging; Covariance matrix; Eigenvalues and eigenfunctions; Machine learning; Simulation; Upper bound; Pseudo-inverse; bagging; linear discriminants; peaking phenomenon; random matrix theory; random subspace method
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2010.186
  • Filename
    5601741
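
  • Illustration
    The quantity described in the abstract, the Frobenius-norm reconstruction error of the pseudo-inverse sample covariance, can be sketched numerically. The snippet below is a minimal illustration and not the paper's code or analysis. It assumes an identity population covariance and Gaussian data, and the function names, the eigenvalue tolerance `tol`, and the chosen sample sizes are all hypothetical. It builds the pseudo-inverse from the eigenvectors with nonzero sample eigenvalues and prints the error as the sample size approaches the number of features.

```python
import numpy as np


def pseudo_inverse_covariance(X, tol=1e-10):
    """Moore-Penrose pseudo-inverse of the sample covariance of X (rows = samples).

    Built from the eigenvectors whose sample eigenvalues are nonzero;
    `tol` is an illustrative relative cutoff for "nonzero".
    """
    Xc = X - X.mean(axis=0)                    # center each feature
    S = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance (rank <= n_samples - 1)
    evals, evecs = np.linalg.eigh(S)           # symmetric eigendecomposition
    keep = evals > tol * evals.max()           # discard numerically zero eigenvalues
    return (evecs[:, keep] / evals[keep]) @ evecs[:, keep].T


def reconstruction_error(n_samples, n_features, rng):
    """Frobenius norm of (pseudo-inverse sample covariance - true inverse covariance).

    Assumption for illustration: the population covariance is the identity,
    so the true inverse covariance is the identity as well.
    """
    X = rng.standard_normal((n_samples, n_features))
    P = pseudo_inverse_covariance(X)
    return np.linalg.norm(P - np.eye(n_features), ord="fro")


rng = np.random.default_rng(0)
n_features = 50
for n_samples in (10, 25, 40, 50):
    # average over a few random draws to smooth out fluctuations
    err = np.mean([reconstruction_error(n_samples, n_features, rng) for _ in range(20)])
    print(f"n_samples={n_samples:3d}  mean Frobenius error={err:.2f}")
```

    Under these assumptions the printed error should grow sharply as the sample size nears the number of features, because the smallest nonzero sample eigenvalues approach zero and their reciprocals dominate the pseudo-inverse.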