Title of article :
Model selection for partial least squares regression
Author/Authors :
Li, Baibing; Morris, Julian; Martin, Elaine B.
Issue Information :
Biannual, with serial issue number; year 2002
Abstract :
Partial least squares (PLS) regression is a powerful and frequently applied technique in multivariate statistical process control when the process variables are highly correlated. Selecting the number of latent variables needed to build a representative model is an important issue. A metric frequently used by chemometricians to determine the number of latent variables is Wold's R criterion, whilst more recently a number of statisticians have advocated the use of the Akaike Information Criterion (AIC). In this paper, Wold's R criterion and AIC are compared, by means of a simulation study, for selecting the number of latent variables to include in a PLS model that will form the basis of a multivariate statistical process control representation. It is shown that neither Wold's R criterion nor AIC exhibits satisfactory performance. This is in contrast to the adjusted Wold's R criterion, which is shown to perform satisfactorily in terms of the number of times the known true model is selected. Two industrial applications are then used to demonstrate the methodology. The first relates to the modelling of product quality using data from an industrial fluidised bed reactor, and the second focuses on an industrial NIR data set. The results are consistent with those of the simulation studies.
Keywords :
Model selection, multivariate statistical process control, Wold's R criterion, Akaike Information Criterion (AIC), cross-validation, Partial least squares regression
Journal title :
Chemometrics and Intelligent Laboratory Systems