DocumentCode :
27133
Title :
A Machine Learning Approach for Accurate Annotation of Noncoding RNAs
Author :
Yinglei Song ; Chunmei Liu ; Zhi Wang
Author_Institution :
Sch. of Electron. & Inf. Sci., Jiangsu Univ. of Sci. & Technol., Zhenjiang, China
Volume :
12
Issue :
3
fYear :
2015
fDate :
May-June 2015
Firstpage :
551
Lastpage :
559
Abstract :
Searching genomes to locate noncoding RNA genes with known secondary structure is an important problem in bioinformatics. In general, the secondary structure of a searched noncoding RNA is defined with a structure model constructed from the structural alignment of a set of sequences from its family. Computing the optimal alignment between a sequence and a structure model is the core part of an algorithm that can search genomes for noncoding RNAs. In practice, a single structure model may not be sufficient to capture all crucial features important for a noncoding RNA family. In this paper, we develop a novel machine learning approach that can efficiently search genomes for noncoding RNAs with high accuracy. During the search procedure, a sequence segment in the searched genome sequence is processed and a feature vector is extracted to represent it. Based on the feature vector, a classifier is used to determine whether the sequence segment is the searched ncRNA or not. Our testing results show that this approach is able to efficiently capture crucial features of a noncoding RNA family. Compared with existing search tools, it significantly improves the accuracy of genome annotation.
Keywords :
RNA; bioinformatics; genetics; genomics; learning (artificial intelligence); molecular biophysics; molecular configurations; RNA sequences; accurate noncoding RNA annotation; feature vector; genomes; machine learning; noncoding RNA genes; secondary structure; single structure model; structural alignment; Accuracy; Computational modeling; Hidden Markov models; IEEE transactions; Noncoding RNAs; classifier; genome annotation
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2014.2366758
Filename :
6945901