DocumentCode :
3138801
Title :
Evaluation of RNA Secondary Structure Motifs using Regression Analysis
Author :
Anwar, Mohammad ; Turcotte, Marcel
Author_Institution :
Sch. of Inf. Technol. & Eng., Ottawa Univ., Ont.
fYear :
2006
fDate :
May 2006
Firstpage :
1747
Lastpage :
1752
Abstract :
Recent experimental evidence has shown that ribonucleic acid (RNA) plays a greater role in the cell than previously thought. An ensemble of RNA sequences believed to contain signals at the structure level can be exploited to detect functional motifs common to all or a portion of those sequences. We present here a general framework for analyzing multiple RNA secondary structures. A family of related RNA structures may be analyzed using statistical regression methods. In this work, we extend our previously developed algorithm, seed, which exhaustively explores the search space of RNA sequence and structure motifs. We introduce several objective functions based on thermodynamic free energy and information content to discriminate native folds from the rest. We assume that the variation across the various scores can be represented by a statistical model. Regression analysis assigns a separate weight to each score, allowing one to emphasize or compensate for the variance that differs across the scores. A statistical model can be formulated using techniques from regression analysis to obtain a template or scoring model that is able to identify putative functional regions in RNA sequences. We show that thermodynamics-based regression models are effective at relating the variation of scores obtained from different functions. The models can generally identify motifs with high specificity and positive predictive value with respect to known motifs. A good scoring method makes it possible to eliminate invalid motifs, thereby reducing the size of the hypothesis space.
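The idea of assigning a separate weight to each score can be illustrated with a minimal sketch of ordinary least squares. This is not the authors' implementation: the function names, the two score columns (free energy and information content) and all numeric values below are hypothetical, and the toy targets are generated from a known linear rule only so that the fit can be checked.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: scores are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_score_weights(
    scores: Sequence[Sequence[float]], targets: Sequence[float]
) -> List[float]:
    """Least-squares fit: returns [intercept, weight_1, ..., weight_k]."""
    rows = [[1.0, *s] for s in scores]  # leading 1.0 models the intercept
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def combined_score(weights: Sequence[float], score: Sequence[float]) -> float:
    """Apply the fitted weights to one motif's score vector."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], score))


# Hypothetical toy data: (free energy, information content) per candidate motif.
toy_scores = [(-12.0, 8.0), (-3.0, 2.0), (-9.5, 6.5), (-1.0, 7.0), (-7.0, 1.5)]
# Synthetic targets from an assumed rule, used only to check the fit.
toy_targets = [0.5 - 0.1 * dg + 0.2 * ic for dg, ic in toy_scores]

toy_weights = fit_score_weights(toy_scores, toy_targets)
ranked = sorted(toy_scores, key=lambda s: combined_score(toy_weights, s), reverse=True)
```

In practice the targets would come from agreement with known motifs, and candidates ranked low by the combined score could be discarded to shrink the hypothesis space.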
Keywords :
biology computing; cellular biophysics; molecular biophysics; organic compounds; regression analysis; RNA secondary structure motif; RNA sequence; putative functional region; ribonucleic acid; search space; statistical regression method; thermodynamic free energy; Accuracy; Biological information theory; Biology computing; Genetics; Information technology; Nearest neighbor searches; RNA; Regression analysis; Sequences; Thermodynamics; Motif discovery; linear regression; ribonucleic acid; secondary structure; thermodynamics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical and Computer Engineering, 2006. CCECE '06. Canadian Conference on
Conference_Location :
Ottawa, Ont.
Print_ISBN :
1-4244-0038-4
Electronic_ISBN :
1-4244-0038-4
Type :
conf
DOI :
10.1109/CCECE.2006.277314
Filename :
4054784