Title of article :
Practical approaches to principal component analysis for simultaneously dealing with missing and censored elements in chemical data (Original Research Article)
Author/Authors :
I. Stanimirova, author
Issue Information :
Journal with serial issue number, year 2013
Abstract :
Multivariate chemical data often contain elements that are missing completely at random together with so-called left-censored elements, whose values are known only to lie below a definite threshold (the reporting limit). In the last several years, attention has been paid to developing methods for data containing missing elements and methods that can handle data with both missing elements and outliers. However, processing data with both missing and left-censored elements remains an open problem.
The aim of this work was to determine which method is most suitable for handling left-censored elements and elements missing completely at random when both are present in chemical data. The comparison covered the recently proposed generalized nonlinear iterative partial least squares (NIPALS) algorithm, methods that include uncertainty information, such as maximum likelihood principal component analysis (MLPCA), and replacement methods.
The results of the Monte Carlo simulation study for artificial and real data sets showed that substitution with half of the reporting limit can be used when the percentage of left-censored elements per variable is up to 30–40%. The generalized NIPALS algorithm is generally recommended for a large percentage of left-censored elements per variable, and particularly when a large number of variables are censored. Applying the expectation-maximization approach to data in which censored elements have been substituted with half of the reporting limit is one strategy for dealing with missing and left-censored elements together; if the convergence criterion is not fulfilled, the generalized NIPALS algorithm can be applied instead.
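As an illustration of the substitution-plus-expectation-maximization strategy described above, the following is a minimal sketch and not the author's implementation. It assumes NumPy, that missing elements are encoded as NaN, and that censored elements are flagged by a separate boolean mask. The function names `substitute_half_limit` and `em_pca_impute`, the tolerance, and the iteration cap are hypothetical choices made for this example. The imputation step is a generic iterative PCA scheme: missing entries are repeatedly replaced by their low-rank reconstruction until the update becomes negligible.

```python
import numpy as np


def substitute_half_limit(X, censored, limits):
    """Replace left-censored entries with half of the per-variable reporting limit."""
    X = np.array(X, dtype=float)  # work on a copy
    censored = np.asarray(censored, dtype=bool)
    half = np.broadcast_to(np.asarray(limits, dtype=float) / 2.0, X.shape)
    X[censored] = half[censored]
    return X


def em_pca_impute(X, n_components, max_iter=500, tol=1e-8):
    """Fill NaN entries by iterating a rank-limited PCA reconstruction.

    Returns the completed matrix and a flag telling whether the
    convergence criterion (assumed here: squared update below tol) was met.
    """
    X = np.array(X, dtype=float)  # work on a copy
    missing = np.isnan(X)
    # Initial guess: column means computed from the observed entries.
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.broadcast_to(col_means, X.shape)[missing]

    converged = False
    for _ in range(max_iter):
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        # Reconstruction from the first n_components principal components.
        X_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
        change = np.sum((X_hat[missing] - X[missing]) ** 2)
        X[missing] = X_hat[missing]  # observed entries are never overwritten
        if change < tol:
            converged = True
            break
    return X, converged
```

A caller would first apply `substitute_half_limit` and then `em_pca_impute`; if the returned flag is false, the abstract's recommendation is to fall back to the generalized NIPALS algorithm, which is not sketched here.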
Keywords :
Expectation-maximization algorithm , Maximum likelihood principal component analysis , Positive matrix factorization , Left-censored data , Generalized nonlinear iterative partial least squares algorithm
Journal title :
Analytica Chimica Acta