• DocumentCode
    3410523
  • Title

    BioSPRINT: classification of intron and exon sequences using the SPRINT algorithm

  • Author

    Crosby, Kevin ; Gabbert, Paula

  • Author_Institution
    Furman Univ., Greenville, SC, USA
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    668
  • Lastpage
    669
  • Abstract
    Classifying items into common groups is an important problem for computer scientists and geneticists alike. This paper focuses on classifying DNA sequences as either introns or exons. Insights from this classification can reduce the laboratory time needed to distinguish introns from exons. A classification tree based on the SPRINT algorithm was trained and tested on sequences from the Drosophila melanogaster and Caenorhabditis elegans genomes. A large test sample error rate of 15% was observed for Drosophila melanogaster, whereas the rate for Caenorhabditis elegans was only 1.6%.
  • Keywords
    biology computing; classification; genetics; molecular biophysics; trees (mathematics); BioSPRINT; Caenorhabditis elegans genome; Drosophila melanogaster genome; SPRINT algorithm; classification tree; computer scientists; exon sequence classification; geneticists; intron sequence classification; Bioinformatics; Classification algorithms; Classification tree analysis; DNA; Data mining; Decision trees; Feature extraction; Genomics; Sequences; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type

    conf

  • DOI
    10.1109/CSB.2004.1332540
  • Filename
    1332540