DocumentCode
3410523
Title
BioSPRINT: classification of intron and exon sequences using the SPRINT algorithm
Author
Crosby, Kevin ; Gabbert, Paula
Author_Institution
Furman Univ., Greenville, SC, USA
fYear
2004
fDate
16-19 Aug. 2004
Firstpage
668
Lastpage
669
Abstract
An important problem for computer scientists and geneticists alike is classifying items into common groups. This paper focuses on classifying DNA sequences as either introns or exons. Insights from this classification can reduce the laboratory time needed to distinguish introns from exons. A classification tree based on the SPRINT algorithm was trained and tested on sequences from the Drosophila melanogaster and Caenorhabditis elegans genomes. On a large test sample, the error rate was 15% for Drosophila melanogaster but only 1.6% for Caenorhabditis elegans.
Keywords
biology computing; classification; genetics; molecular biophysics; trees (mathematics); BioSPRINT; Caenorhabditis elegans genome; Drosophila melanogaster genome; SPRINT algorithm; classification tree; computer scientists; exon sequence classification; geneticists; intron sequence classification; Bioinformatics; Classification algorithms; Classification tree analysis; DNA; Data mining; Decision trees; Feature extraction; Genomics; Sequences; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN
0-7695-2194-0
Type
conf
DOI
10.1109/CSB.2004.1332540
Filename
1332540