Title of article
Cross model validation and optimisation of bilinear regression models
Author/Authors
Gidskehaug, Lars; Anderssen, Endre; Alsberg, Bjørn K.
Issue Information
Biannual, with serial issue number, year 2008
Pages
10
From page
1
To page
10
Abstract
Whenever regression models are optimised, it is important that all optimisation steps are properly validated. Variable selection is one example of parameter estimation that will give overly optimistic models if not included in the validation. There are many examples of reported work where the validation is performed posterior to variable selection, and many have correctly noted that these models are optimistically biased. However, if the availability of samples is limited, separation of the data into a training and validation set may decrease the quality of both the calibration model and the validation. Cross model validation is designed to validate the optimisation by including the variable selection in an extra layer of cross-validation. This means that all available samples are utilised both in the training and for estimating the residual error of the model.
Cross model validation poses challenging questions both conceptually and algorithmically, and a presentation of the full work-flow is needed. We present a complete framework including optimisation, validation and calibration of bilinear regression models with variable selection. Several issues are addressed that are important for each separate stage of the analysis, and suggestions for improvements are proposed. The method is validated on a gene expression data set with a low signal-to-noise ratio and a small number of samples. It is shown that many replicates are needed to model these data properly, and that cross model validated variable selection improves both the final calibration model and the associated error estimates. A Matlab toolbox (Mathworks Inc, USA) is available from www.specmod.org.
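The central point of the abstract, that variable selection must sit inside the validation loop, can be illustrated with a minimal Python sketch. It is not the authors' Matlab toolbox and makes several simplifying assumptions: a correlation filter (the hypothetical helper `select_top_k`) stands in for backward elimination, ordinary least squares stands in for PLSR, the inner cross-validation layer used to tune the selection is omitted, and the data are pure simulated noise.

```python
import numpy as np

def select_top_k(X, y, k):
    """Hypothetical filter: indices of the k columns most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    score = np.abs(Xc.T @ yc) / denom
    return np.argsort(score)[-k:]

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with intercept (a stand-in for PLSR)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def cv_rmse(X, y, k, n_folds, select_inside):
    """Cross-validated RMSE; select_inside=True redoes the selection in each fold."""
    all_idx = np.arange(len(y))
    folds = np.array_split(all_idx, n_folds)
    # Selection on all samples happens only in the unvalidated variant.
    fixed = None if select_inside else select_top_k(X, y, k)
    residuals = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(all_idx, test)
        cols = select_top_k(X[train], y[train], k) if select_inside else fixed
        pred = fit_predict(X[train][:, cols], y[train], X[test][:, cols])
        residuals[test] = y[test] - pred
    return float(np.sqrt(np.mean(residuals ** 2)))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 500))  # simulated noise predictors, few samples
y = rng.normal(size=30)         # response unrelated to X by construction

naive = cv_rmse(X, y, k=5, n_folds=10, select_inside=False)
validated = cv_rmse(X, y, k=5, n_folds=10, select_inside=True)
print(f"selection outside CV: RMSE = {naive:.2f}")
print(f"selection inside CV:  RMSE = {validated:.2f}")
```

Because the response here is unrelated to the predictors, any apparent predictive ability is an artefact. Selecting variables on all samples before cross-validating should report the smaller error of the two, which is the optimistic bias the abstract describes.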
Keywords
variable selection, backward elimination, Microarray data, Cross model validation, Partial least squares regression, PLSR
Journal title
Chemometrics and Intelligent Laboratory Systems
Serial Year
2008
Record number
1489311