Title :
A novel genome-wide polyadenylation sites recognition system based on condition random field
Author :
Jiuqiang Han ; Shanxin Zhang ; Jun Liu ; Ruiling Liu
Author_Institution :
Sch. of Electron. & Inf. Eng., Xi'an Jiaotong Univ., Xi'an, China
Abstract :
Polyadenylation, which comprises cleavage of the pre-mRNA and addition of a stretch of adenosines to the 3'-end, is an essential step of pre-mRNA processing in eukaryotes. The known regulatory role of polyadenylation in mRNA localization, stability, and translation, together with the emerging link between poly(A) and disease states, underlines the need to fully characterize polyadenylation sites. Several artificial intelligence methods have been proposed for poly(A) site recognition. However, these methods are suitable only for small subsets of genome sequences, so a method for genome-wide recognition of poly(A) sites is needed. Recent efforts have identified many poly(A)-related factors at the DNA level. Here, we propose a novel genome-wide poly(A) site recognition method based on the conditional random field (CRF) that integrates multiple features. Compared with polya_svm (the most accurate program for poly(A) site prediction to date), our method achieved a higher area under the ROC curve (0.8621 versus 0.6796). This result suggests that our method is effective for genome-wide poly(A) site recognition.
Keywords :
RNA; association; biochemistry; bioinformatics; diseases; dissociation; feature extraction; genomics; medical computing; molecular biophysics; molecular configurations; random processes; sequences; CRF method; DNA poly(A) related factors; adenosine stretch addition; artificial intelligence methods; condition random field method; disease states; eukaryote pre-mRNA processing; genome sequences; genome-wide polyadenylation site recognition system; mRNA localization; mRNA stability; mRNA translation; multiple feature integration; poly(A) site prediction; poly(A) site recognition; polya_svm program; polyadenylation site characterization; pre-mRNA cleavage; Accuracy; Bioinformatics; Genomics; Proteins; Pulse width modulation; RNA; Support vector machines;
Conference_Title :
Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE
Conference_Location :
Chicago, IL
DOI :
10.1109/EMBC.2014.6944687