Title of article :
Multivariate outlier detection and remediation in
geochemical databases
Author/Authors :
Gerald C. Lalor, Chaosheng Zhang
Issue Information :
Serial journal issue, 2001
Abstract :
In this study, outliers are classified into three types: (1) range outliers; (2) spatial outliers; and (3) relationship
outliers, defined as observations that fall outside of the values expected from correlation within the dataset. The
multivariate methods of principal component analysis (PCA), multiple regression analysis (MRA) and an autoassociation
neural network (AutoNN) method are applied to a dataset comprising 203 samples of rare earth element (REE)
concentrations in soils of Jamaica which shows the expected good correlations between the elements. PCA is shown
to be effective in detection of high value range outliers, while AutoNN and MRA are effective in detection of
relationship outliers. A backpropagation neural network was used to predict the ‘expected values’ of the outliers.
Four obvious relationship outliers with unexpectedly low Sm concentrations were selected as an example for
remediation. The predicted Sm values were confirmed on remeasurement. Neural network methods, with the
advantages of being model-free and effective in solving non-linear relationship problems, appear to provide an
automated and effective way for the quality control of environmental databases.
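
The distinction between range outliers and relationship outliers can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (four strongly correlated columns standing in for log-scaled REE concentrations), the flagging threshold of 3 standard deviations is an assumption, and the function names are hypothetical. Only the PCA and MRA ideas are sketched; the neural network methods are not reproduced here.

```python
import numpy as np

def pca_scores(X):
    """Standardize columns, then return PC scores and the variance of each PC."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt.T
    variances = s**2 / (len(X) - 1)
    return scores, variances

def regression_residuals(X, target):
    """Regress one column on all the others (OLS); return standardized residuals."""
    y = X[:, target]
    A = np.column_stack([np.ones(len(X)), np.delete(X, target, axis=1)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return resid / resid.std(ddof=A.shape[1])

# Synthetic stand-in data: 203 samples, 4 correlated "elements" (assumed, not real REE data).
rng = np.random.default_rng(0)
n = 203
shared = rng.normal(size=n)
X = np.column_stack([shared + 0.1 * rng.normal(size=n) for _ in range(4)])
X[5, :] = 8.0     # range outlier: every element is unusually high
X[10, 2] -= 2.0   # relationship outlier: one element is low relative to the others

# PCA: a sample that is extreme on the first principal component is a range outlier.
scores, variances = pca_scores(X)
pc1_z = scores[:, 0] / np.sqrt(variances[0])
range_flags = np.flatnonzero(np.abs(pc1_z) > 3)

# MRA: a large residual when predicting one element from the rest is a relationship outlier.
rel_flags = np.flatnonzero(np.abs(regression_residuals(X, target=2)) > 3)

print("range outlier candidates:", range_flags)
print("relationship outlier candidates:", rel_flags)
```

In this sketch the regression prediction for a flagged sample plays the role of the 'expected value'; the study itself used a backpropagation neural network for that step.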
Keywords :
Rare earth elements, outlier, database, quality control, principal component analysis, multiple regression analysis, neural network
Journal title :
Science of the Total Environment