Title :
Optimizing period-3 methods for eukaryotic gene prediction
Author :
Akhtar, Mahmood ; Ambikairajah, Eliathamby ; Epps, Julien
Author_Institution :
Sch. of Electr. Eng. & Telecommun., New South Wales Univ., Sydney, NSW
fDate :
March 31 2008-April 4 2008
Abstract :
In this paper, we first investigate the effect of window length on selected signal processing-based gene and exon prediction methods. We then optimize these methods to improve their prediction accuracy by employing the best DNA representation, choosing a suitable window length, and boosting the output signals to enhance protein-coding regions and suppress non-coding regions. The proposed method is shown to outperform major existing time-domain, frequency-domain, and combined time-frequency approaches. Compared with existing DFT-based methods, the proposed method not only requires 50% less processing but also exhibits relative improvements of 53.3%, 46.7%, and 24.2% over the spectral content, spectral rotation, and paired and weighted spectral rotation measures, respectively, in terms of prediction accuracy of exonic nucleotides at a 5% false positive rate on the GENSCAN test set.
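As an illustration of the kind of DFT-based baseline the abstract compares against, the following is a minimal sketch of a sliding-window period-3 spectral content measure. It is not the optimized method proposed in the paper. The function name `spectral_content`, the binary indicator (one sequence per base) DNA representation, the default window of 351 nucleotides, and the `step` parameter are all assumptions made for this example.

```python
import numpy as np

def spectral_content(seq, window=351, step=1):
    """Sliding-window period-3 spectral content (illustrative sketch).

    For each window, the four binary indicator sequences (A, C, G, T)
    are projected onto the DFT basis at k = N/3, and the squared
    magnitudes are summed: S = sum_b |U_b[N/3]|^2.
    """
    if window % 3 != 0:
        raise ValueError("window must be a multiple of 3")
    seq = seq.upper()
    n = np.arange(window)
    # DFT basis vector at frequency index k = N/3 (period of 3 samples).
    kernel = np.exp(-2j * np.pi * n / 3)
    # One 0/1 indicator row per base (assumed representation).
    indicators = np.array(
        [[1.0 if ch == base else 0.0 for ch in seq] for base in "ACGT"]
    )
    starts = range(0, len(seq) - window + 1, step)
    out = np.empty(len(starts))
    for i, s in enumerate(starts):
        coeffs = indicators[:, s:s + window] @ kernel  # four DFT coefficients
        out[i] = np.sum(np.abs(coeffs) ** 2)
    return out
```

A sequence with strong codon-like periodicity gives a large value in every window, whereas a sequence with no period-3 structure gives a value near zero. Changing `window` shifts the balance between resolution along the sequence and noise in the measure, which is the window-length effect the abstract refers to.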
Keywords :
DNA; biology computing; cellular biophysics; encoding; genetics; molecular biophysics; prediction theory; proteins; signal processing; DNA representation; GENSCAN test set; eukaryotic gene prediction; exon prediction; exonic nucleotides; period-3 methods; prediction accuracy; protein coding; spectral content; spectral rotation; Accuracy; Boosting; Optimization methods; Prediction methods; Rotation measurement; Time domain analysis; Time frequency analysis; Discrete Fourier transform; genomic signal processing; sequence analysis
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2008.4517686