DocumentCode
2736833
Title
Stable stem enabled Shannon entropies distinguish non-coding RNAs from random backgrounds
Author
Wang, Yingfeng ; Manzour, Amir ; Shareghi, Pooya ; Shaw, Timothy I. ; Li, Ying-Wai ; Malmberg, Russell L. ; Cai, Liming
Author_Institution
Dept. of Comput. Sci., Univ. of Georgia, Athens, GA, USA
fYear
2011
fDate
3-5 Feb. 2011
Firstpage
184
Lastpage
189
Abstract
The computational identification of RNAs in genomic sequences requires the identification of signals of RNA sequences. Shannon base pairing entropy is an indicator for RNA secondary structure folding certainty, in the detection of structural non-coding RNAs (ncRNAs). Under the Boltzmann ensemble of secondary structures, the probability of a base pair is estimated from its frequency across all the alternative equilibrium structures. However, such an entropy has yet to deliver the desired performance distinguishing ncRNAs from random sequences. Developing novel methods to improve the entropy measure performance may result in more effective ncRNA gene finding based on structure detection. This paper shows that the measuring performance of base pair entropy can be significantly improved with a constrained secondary structure ensemble in which only canonical base pairs are assumed to occur, and energetically stable stems are required, in a fold. This constraint actually reduces the space of the secondary structure and may lower probabilities of base pairs unfavorable to the native fold. Indeed, base pair entropies computed with this constrained model demonstrate substantially narrowed gaps of Z-scores between ncRNAs as well as drastic increases in the Z-score for all 13 tested ncRNA sets compared to shuffled sequences.
Keywords
biology computing; entropy; genetics; molecular biophysics; molecular configurations; Boltzmann ensemble; RNA secondary structure folding certainty; RNA sequences; Z-scores; base pair entropy; canonical base pairs; genomic sequences; ncRNA gene; random backgrounds; stable stem enabled Shannon entropy; structural noncoding RNA; structure detection; Computational modeling; Electronic mail; Entropy; Gallium; RNA; Random sequences; USA Councils; Boltzmann ensemble; RNA secondary structure; Shannon entropy; Z-score; base pair; base pair probability; stable stem; stochastic context-free grammar;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Advances in Bio and Medical Sciences (ICCABS), 2011 IEEE 1st International Conference on
Conference_Location
Orlando, FL
Print_ISBN
978-1-61284-851-8
Type
conf
DOI
10.1109/ICCABS.2011.5729876
Filename
5729876