Title of article :
Identifying translation initiation sites in prokaryotes using support vector machine
Author/Authors :
Gao, Tingting and Yang, Zhixia and Wang, Yong and Jing, Ling
Issue Information :
Journal with serial number, year 2010
Pages :
6
From page :
644
To page :
649
Abstract :
Motivation: Gene identification in genomes has been a fundamental and long-standing task in bioinformatics and computational biology. Many computational methods have been developed to predict genes in prokaryote genomes by identifying translation initiation sites (TISs) in transcript data. However, pseudo-TISs at the genome level cause these methods to produce a high number of false positive predictions. In addition, most of the existing tools use an unsupervised learning framework, whose predictive accuracy may depend on the specific organism chosen. Results: In this paper, we present a supervised learning method based on a support vector machine (SVM) to identify TISs at the genome level. The features are extracted from the sequence data by modeling the sequence segment around predicted TISs as a position-specific weight matrix (PSWM). We train the parameters of our SVM on well-constructed positive and negative TIS datasets. We then apply the method to recognize TISs in E. coli and B. subtilis, and validate it on two GC-rich bacterial genomes: Pseudomonas aeruginosa and Burkholderia pseudomallei K96243. We show that TISs can be recognized accurately at the genome level by our method, irrespective of GC content. Furthermore, we compare our method with four existing methods and demonstrate that it outperforms them in all four organisms.
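Illustrative note: the abstract describes extracting features by modeling the sequence window around a candidate TIS as a PSWM. The following is a minimal sketch of that idea only, not the authors' implementation. The window length, the pseudocount of 1, the uniform background of 0.25, the function names and the toy sequences are all assumptions made for illustration; the abstract does not specify them. The resulting per-position scores are the kind of feature vector that could then be passed to an SVM classifier, which is not shown here.

```python
import math

ALPHABET = "ACGT"


def build_pswm(windows, pseudocount=1.0, background=0.25):
    """Build a log-odds PSWM from equal-length sequence windows.

    pseudocount and background are illustrative assumptions,
    not values taken from the paper.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    pswm = []
    for pos in range(length):
        # Start each base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for w in windows:
            counts[w[pos]] += 1.0
        total = sum(counts.values())
        pswm.append(
            {b: math.log2((counts[b] / total) / background) for b in ALPHABET}
        )
    return pswm


def pswm_features(pswm, window):
    """Return one log-odds score per position of the candidate window."""
    if len(window) != len(pswm):
        raise ValueError("window length must match the PSWM length")
    return [pswm[i][base] for i, base in enumerate(window)]


# Hypothetical toy windows ending in an ATG start codon (not real data).
positive_windows = ["AGGAGGATG", "AGGAGCATG", "AGGTGGATG", "AAGAGGATG"]
pswm = build_pswm(positive_windows)

features = pswm_features(pswm, "AGGAGGATG")    # resembles the training windows
background_like = pswm_features(pswm, "TTTTTTTTT")  # does not resemble them
```

In this sketch, a window that resembles the positive examples gets a higher summed score than one that does not. A supervised classifier trained on such vectors from positive and negative sets can weigh the positions individually rather than relying on the sum.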
Keywords :
Translation initiation site prediction , Support vector machine , Position specific weight matrix
Journal title :
Journal of Theoretical Biology
Serial Year :
2010
Record number :
1540008