Title of article
Cluster resolution: A metric for automated, objective and optimized feature selection in chemometric modeling
Author/Authors
Nikolai A. Sinkov, James J. Harynuk
Issue Information
Monthly journal with serial issue number, year 2011
Pages
9
From page
1079
To page
1087
Abstract
A novel metric termed cluster resolution is presented. This metric compares the separation of clusters of data points while simultaneously considering the shapes of the clusters and their relative orientations. Using cluster resolution in conjunction with an objective variable ranking metric allows for fully automated feature selection for the construction of chemometric models. The metric is based upon considering the maximum size of confidence ellipses around clusters of points representing different classes of objects that can be constructed without any overlap of the ellipses. For demonstration purposes we utilized PCA to classify samples of gasoline based upon their octane rating. The entire GC–MS chromatogram of each sample comprising over 2 × 10^6 variables was considered. As an example, automated ranking by ANOVA was applied followed by a forward selection approach to choose variables for inclusion. This approach can be generally applied to feature selection for a variety of applications and represents a significant step towards the development of fully automated, objective construction of chemometric models.
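The workflow the abstract outlines (rank variables by ANOVA, then add them one at a time while a cluster-separation score improves) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `f_ratio`, `separation_scale`, and `forward_select` are hypothetical, and the score is a simplified stand-in for cluster resolution. It skips PCA, handles only two classes, and measures spread along the line joining the two centroids, which gives a conservative estimate of how far the ellipses could grow before overlapping.

```python
from math import sqrt
from statistics import mean, stdev


def f_ratio(groups):
    """One-way ANOVA F-ratio for a single variable.

    groups: list of lists, one list of values per class.
    """
    values = [v for g in groups for v in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def separation_scale(a, b):
    """Simplified proxy for cluster resolution (an assumption, not the
    published metric): the number of standard deviations by which both
    clusters can be scaled before their projections onto the
    centroid-to-centroid axis touch.

    a, b: lists of points (tuples of equal length), at least 2 points each.
    """
    dim = len(a[0])
    ca = [mean([p[j] for p in a]) for j in range(dim)]
    cb = [mean([p[j] for p in b]) for j in range(dim)]
    diff = [cb[j] - ca[j] for j in range(dim)]
    d = sqrt(sum(x * x for x in diff))
    if d == 0:
        return 0.0
    u = [x / d for x in diff]  # unit vector between the centroids
    sa = stdev([sum(p[j] * u[j] for j in range(dim)) for p in a])
    sb = stdev([sum(p[j] * u[j] for j in range(dim)) for p in b])
    if sa + sb == 0:
        return float("inf")
    return d / (sa + sb)


def forward_select(X, labels):
    """Rank variables by F-ratio, then keep each one only if it raises
    the separation score. X: list of sample rows; labels: two classes."""
    classes = sorted(set(labels))

    def groups(j):
        return [[row[j] for row, y in zip(X, labels) if y == c] for c in classes]

    ranked = sorted(range(len(X[0])), key=lambda j: f_ratio(groups(j)), reverse=True)
    selected, best = [], 0.0
    for j in ranked:
        trial = selected + [j]
        a = [tuple(r[i] for i in trial) for r, y in zip(X, labels) if y == classes[0]]
        b = [tuple(r[i] for i in trial) for r, y in zip(X, labels) if y == classes[1]]
        score = separation_scale(a, b)
        if score > best:
            selected, best = trial, score
    return selected, best
```

Calling `forward_select(X, labels)` returns the indices of the retained variables together with the final score. A fuller treatment would compute the score on PCA scores and test actual ellipse overlap, as the abstract describes.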
Keywords
Chemometrics , PCA , ANOVA , GC–MS , Gasoline , Cluster resolution , feature selection
Journal title
Talanta
Serial Year
2011
Record number
1661466