• DocumentCode
    2460709
  • Title
    A Comparison Study of Virus Classification by Genome Sequences
  • Author
    Wang, Jing-doo
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Asia Univ., Taichung, Taiwan
  • fYear
    2011
  • fDate
    24-26 Oct. 2011
  • Firstpage
    270
  • Lastpage
    273
  • Abstract
    In this study, instead of following traditional approaches to virus classification, we proposed a novel approach in the vector space model that classifies viruses using two types of genome sequences, DNA and CDS. For DNA sequences, the k-mer approach was adopted for pattern extraction, and the entropy of the pattern frequency distribution among classes was used for pattern weighting. For CDS sequences, pattern extraction was instead based on identifying distinctive protein functions formed by CDS clustering, and a weighting method for these protein functions, similar to the tf * idf approach, was proposed. The experimental data were downloaded from NCBI and comprised 1,877 selected viruses in 35 classes (virus families). The highest classification accuracies obtained with an SVM classifier were 94.7% and 91.3% for DNA and CDS sequences, respectively. This study not only proposed a novel approach for virus classification but also provided a new methodology for comparative genomic analysis.
  • Keywords
    DNA; biology computing; cellular biophysics; genomics; microorganisms; molecular biophysics; physiological models; proteins; support vector machines; CDS clustering; DNA sequence; SVM classifier; classification accuracy; comparative genomic analysis; genome sequences; k-mer approach; pattern extraction; pattern frequency distribution; pattern weighting; protein functions; vector space model; virus classification; Accuracy; Bioinformatics; DNA; Encoding; Genomics; Vectors; Viruses (medical); Comparative genomics; genome sequence; virus classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering (BIBE), 2011 IEEE 11th International Conference on
  • Conference_Location
    Taichung
  • Print_ISBN
    978-1-61284-975-1
  • Type
    conf
  • DOI
    10.1109/BIBE.2011.47
  • Filename
    6089838
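
The DNA branch described in the abstract (k-mer pattern extraction followed by entropy-based weighting across classes) might look roughly like the sketch below. This is an illustration only, not the paper's implementation: the abstract does not give the weighting formula, so the normalized form `1 - H / log(number of classes)` is an assumption, and the function names, the toy sequences, and the labels `famA` and `famB` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k from seq."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def class_kmer_counts(labeled_seqs, k):
    """Map each k-mer to a Counter of its occurrences per class label."""
    counts = defaultdict(Counter)
    for label, seq in labeled_seqs:
        for kmer in kmers(seq, k):
            counts[kmer][label] += 1
    return counts


def entropy_weight(per_class, n_classes):
    """Assumed weighting: 1 - H / log(n_classes).

    H is the Shannon entropy of the k-mer's frequency distribution over
    classes. A k-mer seen in only one class gets weight 1; a k-mer spread
    evenly over all classes gets weight 0.
    """
    total = sum(per_class.values())
    if total == 0 or n_classes < 2:
        return 0.0
    h = -sum((c / total) * math.log(c / total)
             for c in per_class.values() if c > 0)
    return 1.0 - h / math.log(n_classes)


# Hypothetical toy data: (class label, DNA sequence) pairs.
toy = [
    ("famA", "ACGTACGT"),
    ("famA", "ACGTTT"),
    ("famB", "GGGACG"),
    ("famB", "GGGCCC"),
]

counts = class_kmer_counts(toy, k=3)
n_classes = len({label for label, _ in toy})
weights = {kmer: entropy_weight(c, n_classes) for kmer, c in counts.items()}
```

In this toy data, a 3-mer such as `GGG` that occurs only in `famB` receives the maximum weight, while `ACG`, which occurs in both classes, is down-weighted. The weighted k-mer counts would then form the feature vectors passed to a classifier such as an SVM.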