Title of article :
Small-sample and selection bias effects in multivariate calibration, exemplified for OLS and PLS regressions
Author/Authors :
Sundberg, Rolf
Issue Information :
Biannual, with serial issue number, 2006
Abstract :
In multivariate calibration by, for example, ordinary least squares (OLS) multiple regression or partial least squares regression (PLSR), the predictor ŷ(x) is perfect for the calibration sample itself, in the sense that the regression of observed y on predicted ŷ(x) is y = ŷ(x). Plots of y against ŷ(x) are much used to illustrate how good the calibration is and how well the prediction works. Usually, and rightly, this is combined with cross-validation. In particular, cross-validation can show that for small samples the predictor ŷ(x) will be biased, in the sense that the regression coefficient of y on ŷ(x) is less than one, typically only slightly so for PLSR but substantially so for OLSR. Another bias effect appears when the y-values for the calibration are more or less selected. An increase in the spread of y might appear desirable because it increases the precision in the calibration. However, the resulting selection bias can affect both PLSR and OLSR substantially, and an additional problem with this bias is that it cannot be detected by cross-validation. These bias effects are illustrated here by resampling from a large multivariate data set containing measurements on 344 pigs from slaughter pig grading.
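The small-sample effect described in the abstract can be sketched numerically. The following is a minimal illustration on synthetic Gaussian data, not the pig-grading data and not the author's procedure; the sample size, number of predictors, coefficient values, and the helper names `slope_y_on_yhat`, `ols_fit_predict`, and `loo_predictions` are all assumptions made for this example. It compares the slope of y on ŷ(x) within the calibration sample against the slope obtained from leave-one-out cross-validated OLS predictions.

```python
import numpy as np


def slope_y_on_yhat(y, yhat):
    """Slope of the least-squares regression of observed y on predicted yhat."""
    yc = y - y.mean()
    pc = yhat - yhat.mean()
    return float(pc @ yc / (pc @ pc))


def ols_fit_predict(X_train, y_train, X_test):
    """Fit OLS with an intercept on the training data and predict for X_test."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_test)), X_test])
    return B @ beta


def loo_predictions(X, y):
    """Leave-one-out cross-validated OLS predictions."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        preds[i] = ols_fit_predict(X[keep], y[keep], X[i:i + 1])[0]
    return preds


# Assumed toy setting: small calibration sample relative to the predictor count.
rng = np.random.default_rng(0)
n, p = 20, 8
in_sample_slopes, cv_slopes = [], []
for _ in range(200):
    X = rng.normal(size=(n, p))
    y = X @ np.full(p, 0.3) + rng.normal(size=n)
    in_sample_slopes.append(slope_y_on_yhat(y, ols_fit_predict(X, y, X)))
    cv_slopes.append(slope_y_on_yhat(y, loo_predictions(X, y)))

mean_in_sample_slope = float(np.mean(in_sample_slopes))
mean_cv_slope = float(np.mean(cv_slopes))
print("mean in-sample slope:", mean_in_sample_slope)
print("mean cross-validated slope:", mean_cv_slope)
```

Within the calibration sample the slope equals one by construction, because OLS residuals are orthogonal to the fitted values. The cross-validated slope falls below one in this setting, which is the kind of small-sample bias the abstract attributes to OLSR. The selection bias is not covered by this sketch.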
Keywords :
PLSR , representativity , bias , cross-validation , OLS , Multivariate calibration , Prediction
Journal title :
Chemometrics and Intelligent Laboratory Systems