• DocumentCode
    495465
  • Title
    Identifying DNA Strands Using a Kernel of Classified Sequences
  • Author
    Tonsmann, Guillermo; Pollock, David D.; Gu, Wanjun; Castoe, Todd A.
  • Author_Institution
    Park Univ., Austin, TX, USA
  • Volume
    3
  • fYear
    2009
  • fDate
    March 31 2009-April 2 2009
  • Firstpage
    703
  • Lastpage
    707
  • Abstract
    Automated DNA sequencing produces a large amount of raw DNA sequence data that then needs to be classified, organized, and annotated. One major application is the comparison of new DNA sequences with previously classified sequences. In this paper we present a new approach to performing these comparisons. From a kernel of previously classified DNA sequences, we identify distinctive oligomers, or short DNA sequences, that are infrequent and thus highly specific within the kernel. We then search for these distinctive oligomers in the new, unclassified DNA sequences. Their presence indicates a possible relation between a new DNA sequence and every previously classified DNA sequence that shares the distinctive oligomer. Ultimately, unclassified sequences are related to the classified sequences with which they share the highest number of distinctive oligomers. We explain the details of our technique and show some experimental results on a kernel of immunoglobulin DNA sequences.
  • Keywords
    biology computing; pattern classification; DNA strands; automated DNA sequencing; classified sequences; immunoglobulin DNA sequences; Computer science; DNA; Data engineering; Databases; Genetic engineering; Genomics; Immune system; Kernel; Proteins; Sequences; DNA classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Engineering, 2009 WRI World Congress on
  • Conference_Location
    Los Angeles, CA
  • Print_ISBN
    978-0-7695-3507-4
  • Type
    conf
  • DOI
    10.1109/CSIE.2009.506
  • Filename
    5170932
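
The procedure outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not give the oligomer length, the threshold that makes an oligomer "infrequent", or any function names, so `k`, `max_occurrences`, `kmers`, `distinctive_oligomers`, and `classify` are all hypothetical. The sketch counts an oligomer once per sequence and treats it as distinctive when it occurs in at most `max_occurrences` kernel sequences.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings (oligomers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def distinctive_oligomers(kernel, k, max_occurrences):
    """Map each infrequent oligomer to the kernel sequences containing it.

    kernel: dict of sequence name -> DNA string (the classified sequences).
    An oligomer is kept only if it appears in at most max_occurrences
    kernel sequences (an assumed definition of "distinctive").
    """
    owners = defaultdict(set)
    for name, seq in kernel.items():
        for oligo in kmers(seq, k):
            owners[oligo].add(name)
    return {o: names for o, names in owners.items()
            if len(names) <= max_occurrences}


def classify(query, distinctive, k):
    """Relate an unclassified sequence to kernel sequences.

    Returns (best_names, counts): counts maps each kernel sequence to the
    number of distinctive oligomers it shares with query, and best_names
    lists the kernel sequences with the highest count (empty if none).
    """
    counts = defaultdict(int)
    for oligo in kmers(query, k):
        for name in distinctive.get(oligo, ()):
            counts[name] += 1
    if not counts:
        return [], {}
    best = max(counts.values())
    return sorted(n for n, c in counts.items() if c == best), dict(counts)


if __name__ == "__main__":
    # Toy kernel with made-up sequences, for illustration only.
    kernel = {"A": "ACGTACGTAA", "B": "TTGGCCTTGG"}
    table = distinctive_oligomers(kernel, k=4, max_occurrences=1)
    print(classify("CGTACG", table, k=4))
```

Building the oligomer table once and reusing it for every query keeps each classification proportional to the query length, which is one plausible reason to precompute a kernel; the paper itself should be consulted for the actual selection criteria.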