Title :
Stochastic context-free grammars for modeling RNA
Author :
Sakakibara, Yasubumi ; Brown, Michael ; Underwood, Rebecca C. ; Mian, I. Saira ; Haussler, David
Author_Institution :
Dept. of Comput. & Inf. Sci., California Univ., Santa Cruz, CA, USA
Abstract :
Stochastic context-free grammars (SCFGs) are used to fold, align and model a family of homologous RNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. The novel aspect of this work is that SCFG parameters are learned automatically from unaligned, unfolded training sequences. A generalization of the HMM forward-backward algorithm is introduced. The new algorithm, based on tree grammars and faster than the previously proposed SCFG inside-outside algorithm, is tested on the transfer RNA (tRNA) family. Results show the model can discern tRNA from similar-length RNA sequences, can find secondary structure of new tRNA sequences, and can give multiple alignments of large sets of tRNA sequences. The model is extended to handle introns in tRNA.
Keywords :
biology; context-free grammars; hidden Markov models; pattern recognition; DNA; RNA; database searching; homologous RNA sequences; multiple alignments; multiple sequence alignments; protein; stochastic context-free grammars; transfer RNA; tree grammars
Conference_Title :
System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on
Conference_Location :
Wailea, HI, USA
Print_ISBN :
0-8186-5090-7
DOI :
10.1109/HICSS.1994.323568