Title of article :
Ascertainment of the number of samples in the validation set in Monte Carlo cross validation and the selection of model dimension with Monte Carlo cross validation
Author/Authors :
Du, Yi Ping and Kasemsumran, Sumaporn and Maruo, Katsuhiko and Nakagawa, Takehiro and Ozaki, Yukihiro
Issue Information :
Biannual, serial issue, 2006
Pages :
7
From page :
83
To page :
89
Abstract :
Monte Carlo cross validation (MCCV) is applied to two data sets, containing 125 and 1643 near-infrared (NIR) spectra of biological samples, respectively, to determine how many samples should be left out for validation in MCCV and, consequently, the dimension of the partial least squares (PLS) models. Once a suitable number of validation samples has been selected, the appropriate number of latent variables (LV) can be chosen correctly. The results show that the root mean squared error of calibration (RMSEC), the root mean squared error of cross validation (RMSECV) and the number of LVs become sensitive to the number of samples left out for validation when too many samples are left out. On this basis, RMSEC and RMSECV are suggested as criteria for determining the number of samples to leave out for validation in MCCV. The method is easy and convenient to use. For a larger data set, more samples may be left out, but the suitable number of samples left out decreases when the measurement error level is high.
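The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the PLS1 (NIPALS) routine, the pooled RMSEC/RMSECV formulas, the function names (`pls1_fit`, `pls1_predict`, `mccv`, `make_synthetic`) and the synthetic data are all assumptions made for illustration, and the record does not give the paper's exact definitions. The sketch repeatedly leaves out a random subset of samples, fits PLS models of increasing dimension on the rest, and reports RMSEC and RMSECV per number of latent variables, so that their behaviour can be compared across different numbers of samples left out.

```python
import numpy as np


def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model with NIPALS (assumed variant)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # mean-centre the calibration data
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)             # weight vector
        t = Xr @ w                            # scores
        tt = t @ t
        p = Xr.T @ t / tt                     # X loadings
        c = (yr @ t) / tt                     # y loading
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - c * t                       # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # regression vector in X space
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def mccv(X, y, n_left_out, max_lv, n_splits=100, seed=0):
    """Return (rmsec, rmsecv), one value per number of LVs from 1 to max_lv.

    Errors are pooled over all random splits (an assumed definition).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sse_cal = np.zeros(max_lv)
    sse_val = np.zeros(max_lv)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        val, cal = perm[:n_left_out], perm[n_left_out:]   # random split
        for k in range(1, max_lv + 1):
            model = pls1_fit(X[cal], y[cal], k)
            sse_cal[k - 1] += np.sum((y[cal] - pls1_predict(model, X[cal])) ** 2)
            sse_val[k - 1] += np.sum((y[val] - pls1_predict(model, X[val])) ** 2)
    rmsec = np.sqrt(sse_cal / (n_splits * (n - n_left_out)))
    rmsecv = np.sqrt(sse_val / (n_splits * n_left_out))
    return rmsec, rmsecv


def make_synthetic(n=60, p=10, n_factors=3, noise=0.05, seed=1):
    """Hypothetical stand-in for spectra: a few latent factors plus noise."""
    rng = np.random.default_rng(seed)
    T = rng.normal(size=(n, n_factors))
    X = T @ rng.normal(size=(n_factors, p)) + noise * rng.normal(size=(n, p))
    y = T @ np.ones(n_factors) + noise * rng.normal(size=n)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    # Scan several validation-set sizes and compare the error curves.
    for n_left_out in (6, 30, 48):
        rmsec, rmsecv = mccv(X, y, n_left_out, max_lv=5, n_splits=50)
        best_lv = int(np.argmin(rmsecv)) + 1
        print(n_left_out, best_lv, np.round(rmsec, 3), np.round(rmsecv, 3))
```

Scanning `n_left_out` in this way gives one RMSEC curve and one RMSECV curve per validation-set size. Under the abstract's suggestion, the point at which these curves and the selected LV number start to change noticeably would indicate that too many samples are being left out.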
Keywords :
Leave-one-out cross validation , Cross Validation , partial least squares , Near-infrared spectra , Monte Carlo cross validation
Journal title :
Chemometrics and Intelligent Laboratory Systems
Serial Year :
2006
Record number :
1461622