Title of article :
Estimating the uncertainty in estimates of root mean square error of prediction: application to determining the size of an adequate test set in multivariate calibration
Author/Authors :
Nicolaas (Klaas) M. Faber
Issue Information :
Biannual, serially numbered issue, 1999
Abstract :
Root mean square error of prediction (RMSEP) is widely used as a criterion for judging the performance of a multivariate calibration model; often it is even the sole criterion. Two methods are discussed for estimating the uncertainty in estimates of RMSEP. One method follows from the approximate sampling distribution of mean square error of prediction (MSEP), while the other is based on error propagation, which is a distribution-free approach. The results from a small Monte Carlo simulation study suggest that, provided that extreme outliers are removed from the test set, MSEP estimates are approximately proportional to a χ² random variable with n degrees of freedom, where n is the number of samples in the test set. It is detailed how this knowledge can be used to determine the size of an adequate test set. The advantages over the guideline issued by the American Society for Testing and Materials (ASTM) are discussed. The expression derived by the method of error propagation is shown to systematically overestimate the true uncertainty. A correction factor is introduced to ensure approximately correct behaviour. Close agreement is found between the uncertainties calculated using the two complementary methods. The consequences of using too small a test set are illustrated on a practical data set.
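The link between the χ² behaviour of MSEP and test set size can be illustrated with a minimal sketch. It assumes independent, zero-mean, normally distributed prediction errors, and it uses the standard large-n approximation that the relative standard deviation of the square root of a scaled χ² variable with n degrees of freedom is about 1/sqrt(2n). The function names and the simulation settings are hypothetical and chosen for illustration only; this is not necessarily the exact expression or procedure used in the article.

```python
import math
import random


def relative_sd_rmsep(n):
    """Approximate relative standard deviation of RMSEP for n test samples,
    assuming MSEP is proportional to a chi-square variable with n degrees
    of freedom (large-n approximation: 1 / sqrt(2n))."""
    return 1.0 / math.sqrt(2.0 * n)


def required_test_size(target_rel_sd):
    """Smallest n for which the approximate relative standard deviation
    of RMSEP does not exceed target_rel_sd (illustrative helper)."""
    return math.ceil(1.0 / (2.0 * target_rel_sd ** 2))


def simulate_rel_sd(n, sigma=1.0, trials=2000, seed=0):
    """Monte Carlo check: draw normal prediction errors, compute RMSEP for
    each simulated test set, and return the observed relative spread."""
    rng = random.Random(seed)
    rmseps = []
    for _ in range(trials):
        sse = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n))
        rmseps.append(math.sqrt(sse / n))
    mean = sum(rmseps) / trials
    var = sum((r - mean) ** 2 for r in rmseps) / (trials - 1)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    for n in (10, 20, 50, 200):
        print(n, round(relative_sd_rmsep(n), 4), round(simulate_rel_sd(n), 4))
    print("n needed for 10% relative uncertainty:", required_test_size(0.10))
```

Under these assumptions, the uncertainty in RMSEP shrinks only with the square root of the test set size, so halving the relative uncertainty requires roughly four times as many test samples.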
Keywords :
Multivariate calibration , Test set , RMSEP , Monte Carlo simulation , error propagation , Distribution , ASTM
Journal title :
Chemometrics and Intelligent Laboratory Systems