DocumentCode :
3408827
Title :
A hierarchical mixture of Markov models for finding biologically active metabolic paths using gene expression and protein classes
Author :
Mamitsuka, Hiroshi ; Okuno, Yasushi
Author_Institution :
Inst. for Chem. Res., Kyoto Univ., Uji, Japan
fYear :
2004
fDate :
16-19 Aug. 2004
Firstpage :
341
Lastpage :
352
Abstract :
With the recent development of high-throughput experimental techniques, the variety and volume of accumulated biological data have increased dramatically in the past few years. Mining different types of data together may lead to new biological insights. We present a new methodology for systematically combining three different datasets to find biologically active metabolic paths/patterns. The method consists of two steps. First, it efficiently synthesizes metabolic paths from a given set of known chemical reactions whose enzymes are co-expressed. Second, it represents the obtained metabolic paths in a more comprehensible way by estimating the parameters of a probabilistic model from these synthesized paths. The model is built on the assumption that the entire set of chemical reactions corresponds to a Markov state transition diagram. Furthermore, it is a hierarchical latent variable model that contains a set of protein classes as a latent variable, so that input paths are clustered in terms of existing knowledge of protein classes. We tested the performance of our method on a main pathway of glycolysis and found that it achieved higher predictive performance in classifying gene expressions than other unsupervised methods. We further analyzed the estimated parameters of our probabilistic models and found that biologically active paths were clustered into only two or three patterns for each expression experiment type, and that each pattern suggested some new long-range relations in the glycolysis pathway.
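The clustering idea in the abstract can be illustrated with a minimal sketch. It assumes a plain (non-hierarchical) mixture of first-order Markov chains, which is a simplification and not the authors' hierarchical model with protein classes. The state names `R1` to `R3`, the probabilities, and the function names are all hypothetical. The sketch computes the posterior probability that a path belongs to each mixture component, which is the kind of quantity an EM-style parameter estimation would use.

```python
def path_likelihood(path, init, trans):
    """Probability of a state sequence under one first-order Markov chain."""
    p = init.get(path[0], 0.0)
    for a, b in zip(path, path[1:]):
        # Missing transitions are treated as probability zero.
        p *= trans.get(a, {}).get(b, 0.0)
    return p


def responsibilities(path, weights, components):
    """Posterior probability of each mixture component given one path."""
    joint = [
        w * path_likelihood(path, c["init"], c["trans"])
        for w, c in zip(weights, components)
    ]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("path has zero probability under every component")
    return [j / total for j in joint]


# Hypothetical toy parameters: two components over reaction states R1..R3.
components = [
    {"init": {"R1": 1.0},
     "trans": {"R1": {"R2": 0.9, "R3": 0.1}, "R2": {"R3": 1.0}}},
    {"init": {"R1": 1.0},
     "trans": {"R1": {"R2": 0.2, "R3": 0.8}, "R2": {"R3": 1.0}}},
]
weights = [0.5, 0.5]

# Soft assignment of one path to the two components.
post = responsibilities(["R1", "R2", "R3"], weights, components)
```

Here the path passes through `R2`, so the first component, which favors the `R1` to `R2` transition, receives the larger posterior weight.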
Keywords :
Markov processes; biochemistry; biology computing; data mining; enzymes; genetics; molecular biophysics; parameter estimation; physiological models; Markov models; biologically active metabolic paths; chemical reactions; clustering; gene expression; glycolysis; hierarchical latent variable model; probabilistic model; protein classes; Biological system modeling; Biology; Chemical compounds; Databases; Pharmaceuticals; Protein engineering; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
Type :
conf
DOI :
10.1109/CSB.2004.1332447
Filename :
1332447