DocumentCode :
2682526
Title :
Splice Site Recognition in DNA Sequences Using K-mer Frequency Based Mapping for Support Vector Machine with Power Series Kernel
Author :
Damasevicius, R.
Author_Institution :
Software Eng. Dept., Kaunas Univ. of Technol., Kaunas
fYear :
2008
fDate :
4-7 March 2008
Firstpage :
687
Lastpage :
692
Abstract :
Recognition of specific, functionally important DNA sequence fragments is considered one of the most important problems in bioinformatics. One type of such fragment is the splice-junction (intron-exon or exon-intron) site. Detecting splice-junction sites in DNA sequences is important for successful gene prediction. In this paper, a support vector machine (SVM) is used for classification of DNA sequences and splice-site recognition. To find the best classification, four position-independent, k-mer frequency based methods for mapping DNA sequences into the SVM feature space are analyzed. Classification is performed using SVM power series kernels. Kernel parameters are optimized using a modification of the Nelder-Mead (downhill simplex) optimization method. Classification quality is evaluated using the F-measure, which combines the precision and recall metrics. The best classification results are achieved using 4-mers for the exon-intron dataset (78%) and 6-mers for the intron-exon dataset (70%), using 4-nucleotide frequencies.
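The two ideas the abstract leans on most, a position-independent k-mer frequency mapping and the F-measure, can be sketched as follows. This is only an illustrative reading under stated assumptions: the abstract does not specify the paper's four mappings, so `kmer_frequencies` shows one plausible variant (relative frequencies of all overlapping k-mers), and `f_measure` assumes the balanced F1 form (harmonic mean of precision and recall). Both function names are hypothetical and do not come from the paper.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: only the four standard nucleotides are counted


def kmer_frequencies(sequence, k):
    """Map a DNA sequence to a vector of relative k-mer frequencies.

    The vector has 4**k entries, ordered lexicographically over ALPHABET.
    Positions of the k-mers in the sequence are ignored, so the mapping is
    position-independent.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    sequence = sequence.upper()
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows containing other symbols, e.g. N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def f_measure(true_labels, predicted_labels, positive=1):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A feature vector produced this way could be passed to any SVM implementation; the power series kernel and the Nelder-Mead parameter search described in the abstract are not reproduced here.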
Keywords :
DNA; biology computing; genetics; molecular biophysics; pattern classification; support vector machines; DNA sequence; F-measure; Nelder-Mead optimization; bioinformatics; gene prediction; k-mer frequency based mapping; optimal classification; power series kernel; splice site recognition; splice-junction; support vector machine; frequency; kernel; optimization methods; proteins; sequences; support vector machine classification; feature mapping; k-mer frequency; machine learning
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Complex, Intelligent and Software Intensive Systems, 2008. CISIS 2008. International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-0-7695-3109-0
Type :
conf
DOI :
10.1109/CISIS.2008.41
Filename :
4606754