Title of article :
Feature selection in principal component analysis of analytical data
Author/Authors :
Guo, Q and Wu, W and Massart, D.L and Boucon, C and de Jong, S
Issue Information :
Biannual journal, serial issue, year 2002
Abstract :
A feature selection method is proposed that selects a subset of variables in principal component analysis (PCA) preserving as much of the information present in the complete data as possible. The information retained is measured by the percentage of consensus in generalised Procrustes analysis. To avoid an exhaustive search, the best subset of variables is found by applying a genetic algorithm (GA) that optimises the consensus between the subset and the complete data set. The method was evaluated on a standard data set known as the Alate data, and on a high-dimensional industrial gas chromatography (GC) data set. The results showed that the proposed method successfully identified structure-bearing variables in both data sets and that it yields a better subset of variables than the other feature selection methods studied.
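The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the consensus measure is replaced here by a simpler two-configuration Procrustes agreement (1 minus the standardised Procrustes disparity) between the PCA scores of the subset and of the complete data, and the genetic algorithm is a generic fixed-size-subset GA. The function names (pca_scores, procrustes_agreement, ga_select) and all parameters (q, k, pop, gens, the 0.3 mutation rate) are hypothetical choices made for illustration.

```python
import numpy as np

def pca_scores(X, k):
    """Scores of the first k principal components of column-centred X."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def procrustes_agreement(A, B):
    """Agreement in [0, 1] between two configurations after optimal rotation.

    Assumption: stands in for the percentage of consensus used in the paper.
    Both configurations are centred and scaled to unit Frobenius norm; the
    result is the squared sum of singular values of A^T B.
    """
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    na, nb = np.linalg.norm(A), np.linalg.norm(B)
    if na == 0 or nb == 0:
        return 0.0
    s = np.linalg.svd((A / na).T @ (B / nb), compute_uv=False)
    return float(s.sum() ** 2)

def ga_select(X, q, k=2, pop=30, gens=40, seed=0):
    """Search for q variables whose PCA scores best match those of all variables."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    target = pca_scores(X, k)

    def fitness(idx):
        return procrustes_agreement(pca_scores(X[:, idx], k), target)

    # Each individual is an array of q distinct variable indices.
    population = [rng.choice(p, size=q, replace=False) for _ in range(pop)]
    for _ in range(gens):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop // 2]          # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop:
            i, j = rng.choice(len(parents), size=2, replace=False)
            # Crossover: draw q indices from the union of both parents.
            pool = np.union1d(parents[i], parents[j])
            child = rng.choice(pool, size=q, replace=False)
            # Mutation: occasionally swap one index for an unused variable.
            if rng.random() < 0.3:
                unused = np.setdiff1d(np.arange(p), child)
                if unused.size:
                    child[rng.integers(q)] = rng.choice(unused)
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return np.sort(best), fitness(best)
```

Calling ga_select(X, q=3) on an n-by-p data matrix X returns the sorted indices of the selected variables and their agreement score. Because the better half of each generation is carried over unchanged, the best score never decreases from one generation to the next.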
Keywords :
feature selection, genetic algorithm, generalised Procrustes analysis, data mining, gas chromatography, principal component analysis
Journal title :
Chemometrics and Intelligent Laboratory Systems