Title of article
Complete validation for classification and class modeling procedures with selection of variables and/or with additional computed variables
Author/Authors
Forina, M. and Oliveri, P. and Casale, M.
Issue Information
Biannual journal with consecutive issue number, year 2010
Pages
13
From page
110
To page
122
Abstract
Evaluating the predictive ability of a model is an essential step in all chemometric techniques, so it must be performed very carefully. However, when relevant variables are selected (an essential step for data sets with many, frequently thousands of, variables), the selection is generally performed using all the available objects. In some recent classification and class modeling techniques, the Mahalanobis distances of the leverages from the centroids of the categories in the problem are computed from the original or from the selected variables, and then added to the original variables. Here too, the Mahalanobis distances are computed with all the objects. The consequence is an overestimate of the prediction ability, which is very large when the ratio between the number of objects and the number of variables is rather low, so that the variance-covariance matrix is unstable.
In this paper, the correct validation procedures are described for the cases of variable selection and of the addition of Mahalanobis distances computed on the original or the selected variables. The resulting estimates of the prediction ability are compared with those obtained with insufficient validation strategies.
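The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the pure-noise data, the t-like ranking statistic, the nearest-centroid classifier, and all function names (`top_k_variables`, `nearest_centroid_predict`, `cv_accuracy`) are assumptions chosen for illustration, and only the variable-selection case is covered (not the added Mahalanobis distances). The sketch compares cross-validation in which variables are selected once on all objects against cross-validation in which selection is repeated inside each training fold.

```python
import numpy as np

def top_k_variables(X, y, k):
    """Rank variables by a two-class t-like statistic and keep the k best."""
    a, b = X[y == 0], X[y == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    spread = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.argsort(-np.abs(diff / spread))[:k]

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test object to the class with the closest (Euclidean) centroid."""
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def cv_accuracy(X, y, k, n_folds, select_inside, rng):
    """Cross-validated accuracy; select_inside=True repeats selection per fold."""
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    # Incomplete validation: the selection has already seen every test object.
    preselected = None if select_inside else top_k_variables(X, y, k)
    correct = 0
    for test in folds:
        train = np.setdiff1d(idx, test)
        if select_inside:
            # Complete validation: selection uses the training objects only.
            cols = top_k_variables(X[train], y[train], k)
        else:
            cols = preselected
        pred = nearest_centroid_predict(X[train][:, cols], y[train], X[test][:, cols])
        correct += int((pred == y[test]).sum())
    return correct / len(y)

# Hypothetical data: 40 objects, 2000 pure-noise variables, two arbitrary classes,
# so any apparent predictive ability is an artifact of the validation scheme.
data_rng = np.random.default_rng(0)
n_objects, n_variables = 40, 2000
X = data_rng.standard_normal((n_objects, n_variables))
y = np.repeat([0, 1], n_objects // 2)

incomplete = cv_accuracy(X, y, k=10, n_folds=5, select_inside=False,
                         rng=np.random.default_rng(1))
complete = cv_accuracy(X, y, k=10, n_folds=5, select_inside=True,
                       rng=np.random.default_rng(1))
print(f"selection on all objects:    {incomplete:.2f}")
print(f"selection inside each fold:  {complete:.2f}")
```

Because the variables carry no class information, the fold-wise estimate should stay near chance level, while the estimate based on selection with all objects is expected to be optimistic. Both calls use the same fold seed so that the two estimates differ only in where the selection is performed.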
Keywords
Class modeling, Validation, Classification
Journal title
Chemometrics and Intelligent Laboratory Systems
Serial Year
2010
Record number
1489787