Title :
Identification of Splice Sites Based on Discrete Wavelet Transform and Support Vector Machine
Author :
Liu, Qianqian ; Wan, Shu-Wu ; Sun, Ying-Fei
Author_Institution :
Sch. of Inf. Sci. & Eng., Grad. Univ. of Chinese Acad. of Sci., Beijing
Abstract :
With the imminent completion of the Human Genome Project and the rapid growth in the number of complete prokaryotic and eukaryotic genomes, the task of organizing and understanding the resulting sequence and structural data is becoming increasingly pressing and demands better, more efficient analysis algorithms. Computational gene identification is important as a tool for finding biologically relevant features that often cannot be found by traditional sequence database searching techniques. It depends on accurate gene structure, and accurate splice site identification algorithms play an important role in determining gene structure in eukaryotic organisms. In this paper, we propose a novel approach and develop an accurate splice site identification algorithm based on discrete wavelet transform features learned by a support vector machine, which achieves superior performance in splice site prediction without any explicit use of local features. In addition, we introduce a new coding method for DNA sequences, so that sequence data can be handled with image processing techniques. The results show that our splice site identification method performs much better than, or comparably to, other methods built on explicit statistical features, despite its simplicity. The method could therefore also be tried on other signal detection problems.
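Illustrative sketch (not from the paper): the abstract does not specify the DNA coding scheme, the wavelet family, or the SVM settings, so the Python below only shows the general shape of such a pipeline under stated assumptions. It assumes a four-channel binary indicator coding of the bases and an orthonormal Haar wavelet, both chosen here for simplicity; the function names (indicator_encode, haar_dwt, wavelet_features) are hypothetical. The resulting coefficient vector is the kind of input that could then be passed to a support vector machine classifier.

```python
import math

BASES = "ACGT"


def indicator_encode(seq):
    """Map a DNA string to four binary indicator sequences, one per base.

    Assumption: this is a stand-in coding, not the paper's coding method.
    """
    seq = seq.upper()
    return {b: [1.0 if ch == b else 0.0 for ch in seq] for b in BASES}


def haar_step(signal):
    """One level of the orthonormal Haar DWT; the length must be even."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level Haar DWT, flattened as [approx_L, detail_L, ..., detail_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    coeffs = approx
    for d in reversed(details):
        coeffs = coeffs + d
    return coeffs


def wavelet_features(seq, levels=2):
    """Concatenate the Haar coefficients of the four indicator channels."""
    channels = indicator_encode(seq)
    features = []
    for b in BASES:
        features.extend(haar_dwt(channels[b], levels))
    return features


if __name__ == "__main__":
    # Hypothetical 8-base window; real windows around a splice site are longer.
    window = "AGGTAAGT"
    print(len(wavelet_features(window)))
```

With levels=2 the window length must be a multiple of 4. Because the Haar transform used here is orthonormal, the feature vector keeps the energy of the coded sequence, so no information is lost before classification.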
Keywords :
DNA; biology computing; discrete wavelet transforms; genetics; image coding; molecular biophysics; molecular configurations; support vector machines; DNA sequence; coding; discrete wavelet transform; eukaryotes; gene identification; gene structure; genomes; image processing; prokaryotes; sequence database searching technique; splice sites; support vector machine; Algorithm design and analysis; Bioinformatics; Biology computing; Discrete wavelet transforms; Genomics; Humans; Organizing; Pressing; Sequences; Support vector machines;
Conference_Title :
Bioinformatics and Biomedical Engineering, 2008. ICBBE 2008. The 2nd International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-1747-6
Electronic_ISBN :
978-1-4244-1748-3
DOI :
10.1109/ICBBE.2008.20