DocumentCode :
2448404
Title :
Parameterization studies of hidden Markov models representing highly divergent protein sequences
Author :
McClure, Marcella A. ; Raman, Rajasekhar
Author_Institution :
Dept. of Biol. Sci., Nevada Univ., Las Vegas, NV, USA
Volume :
5
fYear :
1995
fDate :
3-6 Jan 1995
Firstpage :
184
Abstract :
Complex genome analysis is the study of nucleic acid and protein sequences to further the understanding of the molecular evolutionary mechanisms and frequency of events instrumental in the construction of genomes. The cornerstone of these studies is the multiple alignment of homologous sequences. To date, no method exists that can correctly identify the most conserved features of distantly related proteins without refinement by human pattern recognition skills. Recent application of HMM approaches to the problem of multiple protein sequence alignment offers a new method of analysis. The quality of the alignment produced by an HMM is dependent on the quality of the model itself. We measure the quality of a model by the correspondence between the optimal model, the highest average entropy/position model, and the biologically informative model, which by definition is the one that captures a specific set of biological features common to a protein family. The studies reported here on the effect of model length and training set size suggest that both play a critical role in generating biologically informative HMMs.
Keywords :
DNA; biology computing; genetics; hidden Markov models; HMM approaches; biological features; biologically informative HMMs; biologically informative model; complex genome analysis; distantly related proteins; highest average entropy/position model; highly divergent protein sequences; homologous sequences; model length; molecular evolutionary mechanisms; multiple protein sequence alignment; nucleic acid; optimal model; parameterization studies; protein family; protein sequences; training set size; Bioinformatics; Biological information theory; Biological system modeling; Genomics; Proteins; RNA; Sequences; Viruses (medical)
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Sciences, 1995. Proceedings of the Twenty-Eighth Hawaii International Conference on
Conference_Location :
Wailea, HI
Print_ISBN :
0-8186-6930-6
Type :
conf
DOI :
10.1109/HICSS.1995.375338
Filename :
375338