Title of article :
Discarding or downweighting high-noise variables in factor analytic models (Original Research Article)
Author/Authors :
Pentti Paatero (author), Philip K. Hopke (author)
Issue Information :
Journal with serial issue number, year 2003
Abstract :
This work examines the factor analysis of matrices in which the signal-to-noise ratio differs greatly between columns (variables). Such matrices often occur when measuring elemental concentrations in environmental samples. In the strongest variables, the error level may be a few percent. For the weakest variables, the data may consist almost entirely of noise. This paper demonstrates that the proper scaling of weak variables is critical. It is found that if a few weak variables are scaled to too high a weight in the analysis, the errors in the computed factors grow, and the increased noise level may obscure the weakest factor(s). The mathematical explanation of this phenomenon is explored by means of Givens rotations. It is shown that the customary form of principal component analysis (PCA), based on autoscaling the original data, is generally very ineffective because the scaling of weak variables becomes much too high. Practical advice is given for dealing with noisy data in both PCA and positive matrix factorization (PMF).
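The effect of autoscaling on a weak variable can be illustrated with a small synthetic example. The sketch below is not taken from the paper: the data, the loadings, the noise levels in `sigma`, and the helper names `autoscale` and `scale_by_noise` are all assumptions made for illustration, and scaling by a known noise standard deviation is only one possible alternative, not necessarily the authors' recommendation. It compares the share of total variance that a pure-noise column receives under each scaling.

```python
import numpy as np

def autoscale(X):
    """Center each column and divide by its sample standard deviation."""
    Xc = X - X.mean(axis=0)
    return Xc / Xc.std(axis=0, ddof=1)

def scale_by_noise(X, sigma):
    """Center each column and divide by its assumed noise standard deviation."""
    Xc = X - X.mean(axis=0)
    return Xc / sigma

# Hypothetical data: one factor, three strong variables, one pure-noise variable.
rng = np.random.default_rng(0)
n_samples = 500
loadings = np.array([[5.0, 4.0, 3.0, 0.0]])   # last column carries no signal
sigma = np.array([0.1, 0.1, 0.1, 1.0])        # assumed known noise levels
scores = rng.normal(size=(n_samples, 1))
X = scores @ loadings + rng.normal(size=(n_samples, 4)) * sigma

# Column variances after each scaling.
var_auto = autoscale(X).var(axis=0, ddof=1)
var_noise = scale_by_noise(X, sigma).var(axis=0, ddof=1)

# Fraction of the total variance contributed by the pure-noise column.
share_auto = var_auto[-1] / var_auto.sum()
share_noise = var_noise[-1] / var_noise.sum()

print("weak-variable share, autoscaled:  ", share_auto)
print("weak-variable share, noise-scaled:", share_noise)
```

In this setup, autoscaling forces every column to unit variance, so the pure-noise column carries the same weight as each strong column. Scaling by the noise level instead leaves it with only a small fraction of the total variance.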
Keywords :
Principal component analysis , Signal-to-noise , Positive matrix factorization , Autoscaling , Givens rotations , Weak variables , Scaling of variables
Journal title :
Analytica Chimica Acta