Title of article :
Mining of Biological Data II: Assessing Data Structure and Class Homogeneity by Cluster Analysis
Author/Authors :
Kamimura, Roy T. and Bicciato, Silvio and Shimizu, Hiroshi and Alford, Joe and Stephanopoulos, Gregory
Issue Information :
Bimonthly, serial issue, year 2000
Pages :
11
From page :
228
To page :
238
Abstract :
An important step in data analysis is class assignment, which is usually done on the basis of a macroscopic phenotypic or bioprocess characteristic, such as high vs. low growth, healthy vs. diseased state, or high vs. low productivity. Unfortunately, such an assignment may lump together samples that are dissimilar when characterized by a more detailed phenotypic or bioprocess description, giving rise to models of lower quality and predictive power. In this paper we present a clustering algorithm for data preprocessing that identifies fundamentally similar lots on the basis of the extent of similarity among the system variables. The algorithm combines aspects of cluster analysis and principal component analysis by applying agglomerative clustering methods to the first principal component of the system data matrix. As part of a rational strategy for developing empirical models, this technique analyzes multivariate data homogeneity to select the lots (samples) that are most appropriate for inclusion in a training set. Samples with similar data structures are identified and grouped together into distinct clusters, and this knowledge is used to form potential training sets. Additionally, the technique can identify atypical lots, i.e., samples that are not simply outliers but that exhibit the general properties of one class while having been assigned to the other. The method is presented along with examples from its application to fermentation data sets.
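The core idea in the abstract (agglomerative clustering applied to the first principal component, followed by a check of cluster membership against the assigned classes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy lot data, the variable standardization, power iteration for the leading eigenvector, average linkage, the choice of two clusters, and the majority-label rule used to flag atypical lots.

```python
import math
from collections import Counter

def first_pc_scores(rows, iters=200):
    """Scores of each lot on the first principal component (assumed: standardized variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    # Sample standard deviation per variable; fall back to 1.0 for constant columns.
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    cov = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)] for a in range(p)]
    # Power iteration for the leading eigenvector; asymmetric start vector (an
    # arbitrary choice) to reduce the chance of starting orthogonal to it.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(zr[j] * v[j] for j in range(p)) for zr in z]

def agglomerate(scores, n_clusters):
    """Naive average-linkage agglomerative clustering of 1-D scores."""
    clusters = [[i] for i in range(len(scores))]

    def dist(a, b):
        return sum(abs(scores[i] - scores[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        _, a, b = min((dist(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def atypical_lots(clusters, labels):
    """Flag lots whose assigned class differs from their cluster's majority class."""
    flagged = []
    for members in clusters:
        majority, _ = Counter(labels[i] for i in members).most_common(1)[0]
        flagged.extend(i for i in members if labels[i] != majority)
    return sorted(flagged)

# Hypothetical lots (rows) by process variables (columns); values are invented.
rows = [
    [10.0, 5.1, 1.0],
    [9.5, 4.8, 1.2],
    [10.4, 5.3, 0.9],
    [2.1, 1.0, 5.0],
    [1.8, 1.2, 5.2],
    [2.3, 0.9, 4.9],  # assigned "high" but resembles the "low" lots
]
labels = ["high", "high", "high", "low", "low", "high"]

scores = first_pc_scores(rows)
clusters = agglomerate(scores, n_clusters=2)
atypical = atypical_lots(clusters, labels)
print(clusters, atypical)
```

On this toy data the last lot clusters with the "low" lots despite its "high" assignment, so it is the one flagged; such a lot would be a candidate for exclusion or reassignment before forming a training set.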
Journal title :
Metabolic Engineering
Serial Year :
2000
Record number :
1428252