DocumentCode :
3410523
Title :
BioSPRINT: classification of intron and exon sequences using the SPRINT algorithm
Author :
Crosby, Kevin ; Gabbert, Paula
Author_Institution :
Furman Univ., Greenville, SC, USA
fYear :
2004
fDate :
16-19 Aug. 2004
Firstpage :
668
Lastpage :
669
Abstract :
An important problem for computer scientists as well as geneticists involves classifying particular items into common groups. This paper focuses on classifying sequences of DNA as either an intron or an exon. Insights from this classification can reduce the time needed for laboratory work to distinguish between introns and exons. A classification tree based on the SPRINT algorithm was trained and tested on sequences from the Drosophila melanogaster and Caenorhabditis elegans genomes. A large test sample error rate of 15% was observed for Drosophila melanogaster, whereas the rate for Caenorhabditis elegans was only 1.6%.
Keywords :
biology computing; classification; genetics; molecular biophysics; trees (mathematics); BioSPRINT; Caenorhabditis elegans genome; Drosophila melanogaster genome; SPRINT algorithm; classification tree; computer scientists; exon sequence classification; geneticists; intron sequence classification; Bioinformatics; Classification algorithms; Classification tree analysis; DNA; Data mining; Decision trees; Feature extraction; Genomics; Sequences; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
Type :
conf
DOI :
10.1109/CSB.2004.1332540
Filename :
1332540