Title :
Prediction of Protein Coding Regions by Support Vector Machine
Author :
Shuo, Guo ; Yi-sheng, Zhu
Author_Institution :
Coll. of Inf. Eng., Dalian Maritime Univ., Dalian, China
Abstract :
With the exponential growth of genomic sequence data, there is an increasing demand for accurate identification of protein coding regions in genomic sequences. Although computational methods for identifying protein coding regions have progressed considerably in recent years, their accuracy and efficiency still need improvement. A novel method for predicting the position of coding regions is proposed. First, a support vector machine is used as a classifier to recognize the first nucleotide of a codon in a coding region. Then, the classifier's output values are analyzed with the short-time Fourier transform, and the differences in their time-frequency characteristics are used to determine the position of coding regions accurately. The algorithm not only predicts coding regions but also identifies the first nucleotide of each codon within them, which is important for accurate translation into a protein sequence. Simulation results show that the proposed method predicts coding regions more effectively than existing coding region discovery tools.
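The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the support vector machine is replaced by a hand-made list of classifier scores (+1 where a position looks like the first nucleotide of a codon, -1 elsewhere), and the function names, the window length of 30, and the step of 3 are assumptions chosen for illustration. The sketch evaluates a single short-time Fourier transform bin at frequency 1/3, where a coding region is expected to produce a strong periodic component in the scores.

```python
import cmath
import math


def period3_power(scores, window=30, step=3):
    """Return (start, power) pairs: sliding-window DFT power at frequency 1/3.

    This is one frequency bin of a short-time Fourier transform with a
    rectangular window. `window` and `step` are illustrative assumptions.
    """
    powers = []
    for start in range(0, len(scores) - window + 1, step):
        seg = scores[start:start + window]
        mean = sum(seg) / window
        # Subtract the mean so a constant score contributes no power.
        coeff = sum((s - mean) * cmath.exp(-2j * math.pi * n / 3)
                    for n, s in enumerate(seg))
        powers.append((start, abs(coeff) ** 2 / window))
    return powers


def codon_frame(scores, start, window=30):
    """Return the position modulo 3 whose scores are highest in the window."""
    seg = scores[start:start + window]
    sums = [sum(seg[k::3]) for k in range(3)]
    best_offset = sums.index(max(sums))
    return (start + best_offset) % 3


# Hypothetical classifier output: flat scores outside positions 60..149,
# and a +1 at every third position (those with index % 3 == 1) inside.
scores = ([-1.0] * 60
          + [1.0 if i % 3 == 1 else -1.0 for i in range(60, 150)]
          + [-1.0] * 60)

powers = period3_power(scores)
threshold = 0.5 * max(p for _, p in powers)  # assumed decision rule
coding_starts = [start for start, p in powers if p > threshold]
print("windows flagged as coding start at:", coding_starts)
print("estimated frame:", codon_frame(scores, coding_starts[0]))
```

With real data, the scores would come from a trained classifier and would be noisy, so the window length and the threshold would need tuning; a library routine such as a full short-time Fourier transform could replace the single-bin computation.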
Keywords :
DNA; Fourier transforms; bioinformatics; support vector machines; DNA sequence; coding region discovery tools; genomic sequences; protein coding regions prediction; short time Fourier transform; Educational institutions; Genomics; Hidden Markov models; Protein engineering; Sequences; Support vector machine classification; Coding Region in DNA Sequence; Codon
Conference_Title :
Intelligent Ubiquitous Computing and Education, 2009 International Symposium on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3619-4
DOI :
10.1109/IUCE.2009.141