• DocumentCode
    3494009
  • Title

    Discriminative Hat Matrix: A new tool for outlier identification and linear regression

  • Author

    Dufrenois, F. ; Noyer, J.C.

  • Author_Institution
    SYVIP Team, LISIC, Calais, France
  • fYear
    2011
  • fDate
    July 31 2011-Aug. 5 2011
  • Firstpage
    777
  • Lastpage
    784
  • Abstract
    The hat matrix is an important auxiliary quantity in linear regression theory for detecting errors in predictors. Traditionally, comparing its diagonal elements with a calibration point serves as a decision rule for separating a dominant linear population from outliers. However, several problems exist. First, the calibration point is not well defined because no exact statistical distribution (asymptotic form) of the hat matrix diagonal exists [1]. Second, being based on the standard covariance matrix, this outlying measure loses its efficiency when the rate of “atypical” observations becomes large [2][3]. In this paper, we present a discriminative version of the hat matrix (DHM) which transposes this classification problem into a subspace clustering problem. We propose a criterion based on linear discriminant analysis, built directly on the properties of the hat matrix, and we show that maximizing it amounts to searching for an optimal projection subspace and an optimal indicator matrix. We also show that the statistic of the hat matrix diagonal “projected” onto this optimal subspace has an exact χ² behaviour, which makes it possible to identify outliers by way of hypothesis testing. Synthetic data sets are used to study the performance of the proposed approach in terms of both regression and classification. We also illustrate its potential application to motion segmentation in image sequences.
  • Keywords
    covariance matrices; pattern classification; pattern clustering; regression analysis; atypical observations; classification problem; covariance matrix; discriminative hat matrix; dominant linear population; hypothesis testing; image sequences; linear discriminant analysis; linear regression theory; motion segmentation; optimal indicator matrix; optimal projection subspace; outlier identification; predictor error detection; subspace clustering problem; Covariance matrix; Distributed databases; Eigenvalues and eigenfunctions; Linear regression; Matrix decomposition; Robustness; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks (IJCNN), The 2011 International Joint Conference on
  • Conference_Location
    San Jose, CA
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4244-9635-8
  • Type

    conf

  • DOI
    10.1109/IJCNN.2011.6033300
  • Filename
    6033300
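
The traditional decision rule mentioned in the abstract (comparing the hat matrix diagonal with a calibration point) can be illustrated with a minimal sketch. This is not the DHM method of the paper. It assumes simple linear regression with an intercept, for which the diagonal of H = X (X^T X)^{-1} X^T has a closed form, and it assumes the common rule-of-thumb cutoff 2p/n as the calibration point. The function names `leverages` and `flag_high_leverage` are hypothetical, as is the sample data.

```python
def leverages(x):
    """Diagonal of the hat matrix H = X (X^T X)^{-1} X^T for the design
    matrix X = [1, x], i.e. simple linear regression with an intercept.
    Closed form: h_ii = 1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    return [1.0 / n + (xi - mean) ** 2 / sxx for xi in x]


def flag_high_leverage(x, p=2):
    """Indices whose leverage exceeds the calibration point 2p/n.
    The cutoff is an assumed rule of thumb, not taken from the paper;
    p is the number of regression parameters (intercept and slope)."""
    h = leverages(x)
    cutoff = 2.0 * p / len(x)
    return [i for i, hi in enumerate(h) if hi > cutoff]


# Illustrative predictor values: five clustered points and one far away.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
h = leverages(x)
flagged = flag_high_leverage(x)
```

With this data only the last observation exceeds the cutoff. The leverages always sum to p, the trace of the hat matrix, so a single extreme point absorbs much of the total. That is one way to see why the abstract reports that the measure loses efficiency as the share of atypical observations grows.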