Title of article
Feature selection in principal component analysis of analytical data
Author/Authors
Guo, Q and Wu, W and Massart, D.L and Boucon, C and de Jong, S
Issue Information
Biannual, serial-numbered issue, year 2002
Pages
10
From page
123
To page
132
Abstract
A feature selection method is proposed that selects a subset of variables in principal component analysis (PCA) while preserving as much of the information in the complete data as possible. The information is measured by the percentage of consensus in generalised Procrustes analysis. To avoid an exhaustive search, the best subset of variables is obtained by applying a genetic algorithm (GA) that optimises the consensus between the subset and the complete data set. The method was evaluated on a standard data set known as the Alate data and on a high-dimensional industrial gas chromatography (GC) data set. The results showed that the proposed method successfully identified structure-bearing variables in both data sets and that it yields a better subset of variables than the other feature selection methods studied.
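The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's criterion is the percentage of consensus in generalised Procrustes analysis, whereas this sketch substitutes a simpler two-configuration Procrustes fit between the PCA scores of the full data and those of a candidate subset. The function names (`pca_scores`, `procrustes_fit`, `ga_select`), the GA operators and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def pca_scores(X, n_components):
    """Scores of column-centred X on its first n_components principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def procrustes_fit(A, B):
    """Share of variation in A reproduced by optimally rotating and scaling B.

    Assumed stand-in for the paper's consensus measure; lies between 0 and 1.
    """
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    sing = np.linalg.svd(A.T @ B, compute_uv=False)
    denom = np.sum(A ** 2) * np.sum(B ** 2)
    return float(sing.sum() ** 2 / denom) if denom > 0 else 0.0


def ga_select(X, n_keep, n_components=2, pop_size=30, n_gen=40, seed=0):
    """Search for n_keep columns of X whose PCA scores best match the full data."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    target = pca_scores(X, n_components)

    def fitness(subset):
        return procrustes_fit(target, pca_scores(X[:, subset], n_components))

    # Each individual is a sorted array of n_keep distinct column indices.
    pop = [np.sort(rng.choice(p, n_keep, replace=False)) for _ in range(pop_size)]
    for _ in range(n_gen):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), 2, replace=False)
            # Crossover: draw the child's variables from the parents' union.
            pool = np.union1d(parents[i], parents[j])
            child = rng.choice(pool, n_keep, replace=False)
            # Mutation (assumed rate 0.3): swap one variable for an unused one.
            if rng.random() < 0.3:
                outside = np.setdiff1d(np.arange(p), child)
                if outside.size:
                    child[rng.integers(n_keep)] = rng.choice(outside)
            children.append(np.sort(child))
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)
```

Calling `ga_select(X, n_keep=5)` on an samples-by-variables array `X` returns the selected column indices and their fit value. Fitness is recomputed on every comparison here for brevity; caching it per subset would be the obvious refinement for high-dimensional data such as the GC set mentioned above.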
Keywords
Feature selection, Genetic algorithm, Generalised Procrustes analysis, Data mining, Gas chromatography, Principal component analysis
Journal title
Chemometrics and Intelligent Laboratory Systems
Serial Year
2002
Record number
1460560