Title of article :
Outliers in partial least squares regression: Application to calibration of wine grade with mean infrared data Original Research Article
Author/Authors :
R. Lletí, E. Meléndez, M.C. Ortiz, L.A. Sarabia, M.S. Sánchez
Issue Information :
Journal with serial number, year 2005
Abstract :
Process control in the elaboration of wines, as well as control of the final quality of the product, is at present incorporating non-destructive methods of analysis so that they can be systematically applied anywhere in the process. MIR spectroscopy is an easy, fast and reproducible technique that allows several parameters to be obtained from the same spectrum and even allows calibration transfer among different instruments. Therefore, MIR spectroscopy is being routinely used in many oenological stations and wine cellars.
As it provides non-specific signals, a multivariate calibration is mandatory, usually by partial least squares regression (PLSR). However, wine samples present high variability due to their origin (varieties of vineyard, land, climate, etc.) and to the elaboration process, so that it is necessary to work with a large number of samples, possibly with important dissimilarities among them. This work studies some of these aspects by using 816 samples of wine representative of seven Spanish Denominations of Origin and measured at the Oenological Station of Haro, the official laboratory of the Qualified Denomination of Origin ‘Rioja’.
In the stage of building and validating a PLS regression model, it is possible to identify samples that present spectral “abnormalities” and/or abnormalities in the response, but when the built model is used for prediction with new spectra, it is only possible to detect samples that present spectral abnormalities. In this context, what we call a “spectral abnormality” depends on the set used for training but also on the calibration itself, i.e. on the analytical response being modelled with the spectra. Thus, the paper studies whether the samples declared abnormal in the successive steps of construction and validation of a partial least squares regression can be detected by using only the (dis)similarity among spectra.
The results shown are obtained by modelling only the alcoholic grade of the wines, but the proposed solution is methodological and can be extended to the calibration of any other parameter. The main conclusion is that, on the studied data set, the samples declared abnormal (outliers) in the calibration step are not detected by using only the spectral (dis)similarity. However, the analysis of the spectra shows the presence of a set of wines, unrelated to the outliers detected by PLS, whose presence in the calibration set is necessary to guarantee that the PLSR model built can be applied to future samples.
Keywords :
Genetic Algorithm , Mean infrared spectroscopy , MIR , Alcoholic grade , Partial Least Squares regression , outlier , Contingence table , cluster analysis , Wine
Journal title :
Analytica Chimica Acta