Regional Information Center for Science and Technology - Discriminative Hat Matrix: A new tool for outlier identification and linear regression

DocumentCode :

3494009

Title :

Discriminative Hat Matrix: A new tool for outlier identification and linear regression

Author :

Dufrenois, F. ; Noyer, J.C.

Author_Institution :

SYVIP Team, LISIC, Calais, France

fYear :

2011

fDate :

July 31 2011-Aug. 5 2011

Firstpage :

777

Lastpage :

784

Abstract :

The hat matrix is an important auxiliary quantity in linear regression theory for detecting errors in predictors. Traditionally, comparing its diagonal elements with a calibration point serves as a decision rule for separating a dominant linear population from outliers. However, several problems exist. First, the calibration point is not well defined because no exact statistical distribution (asymptotic form) of the hat matrix diagonal exists [1]. Second, being based on the standard covariance matrix, this outlying measure loses its efficiency when the rate of “atypical” observations becomes large [2][3]. In this paper, we present a discriminative version of the hat matrix (DHM) which transposes this classification problem into a subspace clustering problem. We propose a criterion based on linear discriminant analysis, built directly on the properties of the hat matrix, and we show that maximizing it amounts to searching for an optimal projection subspace and an optimal indicator matrix. We also show that the statistic of the hat matrix diagonal “projected” on this optimal subspace has an exact χ² behaviour, which makes it possible to identify outliers by way of hypothesis testing. Synthetic data sets are used to study the performance of the proposed approach in terms of both regression and classification. We also illustrate its potential application to motion segmentation in image sequences.
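For orientation, the traditional diagnostic that the abstract starts from can be sketched as follows. This is not the DHM method of the paper. It is only the classical hat matrix H = A (AᵀA)⁻¹ Aᵀ, whose diagonal is compared with a calibration point. The function names, the added intercept column, and the cutoff of 2p/n (a widely used rule of thumb) are assumptions made for this illustration and are not taken from the record.

```python
import numpy as np


def hat_diagonal(X):
    """Return the diagonal of the classical hat matrix and the parameter count p.

    An intercept column is prepended to X (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(X.shape[0]), X])  # design matrix with intercept
    Q, _ = np.linalg.qr(A)                         # reduced QR, so H = Q Q^T
    h = np.sum(Q ** 2, axis=1)                     # h_ii = squared norm of row i of Q
    return h, A.shape[1]


def flag_high_leverage(X, factor=2.0):
    """Flag observations whose leverage exceeds factor * p / n.

    The cutoff is a hypothetical choice of calibration point (rule of thumb),
    not an exact statistical threshold.
    """
    h, p = hat_diagonal(X)
    cutoff = factor * p / h.shape[0]
    return h > cutoff, h, cutoff


# Twenty predictor values in [0, 1] plus one value far away from the others.
x = np.append(np.linspace(0.0, 1.0, 20), 10.0)
flags, h, cutoff = flag_high_leverage(x)
print("cutoff:", cutoff)
print("flagged indices:", np.flatnonzero(flags))
```

Because the cutoff here is only a heuristic, the sketch also illustrates the first limitation raised in the abstract: the calibration point of the classical diagnostic is not tied to an exact distribution.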

Keywords :

covariance matrices; pattern classification; pattern clustering; regression analysis; atypical observations; classification problem; covariance matrix; discriminative hat matrix; dominant linear population; hypothesis testing; image sequences; linear discriminant analysis; linear regression theory; motion segmentation; optimal indicator matrix; optimal projection subspace; outlier identification; predictor error detection; subspace clustering problem; Covariance matrix; Distributed databases; Eigenvalues and eigenfunctions; Linear regression; Matrix decomposition; Robustness; Vectors;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Neural Networks (IJCNN), The 2011 International Joint Conference on

Conference_Location :

San Jose, CA

ISSN :

2161-4393

Print_ISBN :

978-1-4244-9635-8

Type :

conf

DOI :

10.1109/IJCNN.2011.6033300

Filename :

6033300

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3494009