Title :
Application of hidden Markov modeling to the characterization of transcription factor binding sites
Author :
Raman, Rajasekhar ; Overton, G. Christian
Author_Institution :
Sch. of Med., Pennsylvania Univ., Philadelphia, PA, USA
Abstract :
The regulation of gene transcription in eukaryotes, though not well understood, is known to involve sequence-specific recognition and binding of short DNA sequences (transcription elements) by regulatory proteins (transcription factors). The cis-acting transcription elements can be found at a considerable distance upstream or downstream from the gene they control and are often orientation independent. The transcriptional state of a gene is thought to require the formation of a "transcription complex" where multiple elements and factors interact through DNA-protein and protein-protein binding events. As a step towards elucidating the process of transcription complex formation, the authors have developed a model of transcription factor-DNA binding based on Hidden Markov Models (HMM). The parameters of the HMM are probabilities from which an entropy measure can be calculated that characterizes the information requirements of the binding process. Alignment of the set of transcription elements against the HMM also yields information regarding the conserved positions recognized by the transcription factor. Finally, they examine the relationship between the entropy of the model and the accuracy of the model when used for pattern recognition: in general, pattern recognition is efficient when the conservation of bases in the sites is fairly high.
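The abstract does not give the entropy formula, so the following is only an illustrative sketch of how an entropy measure can be computed from HMM probabilities. It assumes the standard Shannon entropy over the four bases, applied to one emission distribution per site position. The names `column_entropy` and `information_content` and the probability values are hypothetical and are not taken from the paper.

```python
import math

def column_entropy(probs):
    """Shannon entropy (bits) of one emission distribution over A, C, G, T.

    Zero-probability bases contribute nothing, by the convention 0*log(0) = 0.
    """
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def information_content(probs):
    """Bits of information relative to a uniform 4-base background (assumption)."""
    return 2.0 - column_entropy(probs)

# Hypothetical emission probabilities for three positions of a binding site:
# fully conserved, two equally likely bases, and no conservation at all.
emissions = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]

entropies = [column_entropy(e) for e in emissions]
info = [information_content(e) for e in emissions]
```

Under these assumptions, low entropy marks a highly conserved position, which is the regime in which the abstract reports that pattern recognition works well.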
Keywords :
DNA; biology; entropy; hidden Markov models; pattern recognition; DNA sequences; eukaryotes; gene transcription; hidden Markov modeling; recognition; regulatory proteins; transcription complex formation; transcription elements; transcription factor binding sites
Conference_Title :
System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on
Conference_Location :
Wailea, HI, USA
Print_ISBN :
0-8186-5090-7
DOI :
10.1109/HICSS.1994.323569