Title of article :
Circular effects in representations of an RNA nucleotides data set in relation with principal components analysis
Author/Authors :
Reijmers, T.H. and Wehrens, R. and Buydens, L.M.C.
Issue Information :
Biannual, with consecutive issue number, year 2001
Abstract :
During the last few years, the main reason for using molecular structure databases has changed. Instead of serving only as a storage medium, databases are now also used as a source for data-mining applications. Because these databases contain large numbers of objects and variables, multivariate techniques are applied alongside univariate techniques to search for knowledge hidden in the data. A popular multivariate technique for exploring the underlying structure in data is principal component analysis (PCA). Because structure data are often represented as torsion angles, and PCA was not originally designed to deal with this kind of circular data, the outcome of PCA experiments can be misleading. This article describes several alternative representations of circular data and their effect on the outcome of PCA experiments. A worked example is given using a database of RNA nucleotides.
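The circular effect mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the synthetic angles, the function names, and the choice of a (cos, sin) encoding are all assumptions made here for illustration. The (cos, sin) encoding is one commonly used representation of circular data, and it may or may not be among those the authors compare. The sketch shows that angles clustered around 180 degrees appear as two distant groups when stored in the range [-180, 180), which inflates the variance that PCA sees, whereas the encoded values stay close together.

```python
import numpy as np

def angles_to_cos_sin(angles_deg):
    """Encode each angle (in degrees) as a (cos, sin) pair; columns are [cos..., sin...]."""
    rad = np.deg2rad(angles_deg)
    return np.hstack([np.cos(rad), np.sin(rad)])

def pca_variances(X):
    """Variance along each principal component, via SVD of the mean-centred data."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    return singular_values ** 2 / (X.shape[0] - 1)

# Hypothetical data: two torsion angles for 200 objects (not the RNA data set).
rng = np.random.default_rng(0)
angle_a = rng.normal(180.0, 10.0, size=(200, 1))   # tight cluster around 180 degrees
angle_b = rng.normal(60.0, 10.0, size=(200, 1))    # tight cluster around 60 degrees

# Store angle_a in [-180, 180): the single cluster is split near +180 and -180.
angle_a_wrapped = (angle_a + 180.0) % 360.0 - 180.0
raw = np.hstack([angle_a_wrapped, angle_b])

raw_var = pca_variances(raw)                      # first PC dominated by the wrap-around
enc_var = pca_variances(angles_to_cos_sin(raw))   # no artificial split after encoding

print("raw PC variances:    ", raw_var)
print("encoded PC variances:", enc_var)
```

In this sketch the first principal component of the raw angles mostly reflects the artificial jump between +180 and -180 degrees rather than any real structure. The encoding doubles the number of variables, which is a trade-off to weigh when choosing a representation.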
Keywords :
Data mining , RNA nucleotides , Multivariate analysis , PCA
Journal title :
Chemometrics and Intelligent Laboratory Systems