• DocumentCode
    2517183
  • Title

    Prokaryote Gene Data Classifier Design Based on SVM

  • Author

    Li Xiao-xia ; Sun Bo ; Han Xue-mei ; Zhang Ji-hong

  • Author_Institution
    Sch. of Inf. Eng., Southwest Univ. of Sci. & Technol., Mianyang, China
  • fYear
    2009
  • fDate
    11-13 June 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Gene recognition is one of the important problems in bioinformatics and has drawn extensive experimental, theoretical, and algorithmic research. The E. coli K12 whole-genome sequence and gene mark files from GenBank were analyzed in preparation for gene prediction. First, the four types of gene distribution were analyzed. Second, non-coding samples were generated from the intervals between discrete genes, and the training set was constructed from all gene samples and non-gene fragments. Third, the probability densities of the GC ratio and length features of the training samples were plotted using the Parzen window method. The average GC ratios of the gene and non-coding samples are 0.51 and 0.45, respectively, and their average lengths are 954 and 164 nucleotides, respectively. Finally, a Fisher linear classifier and a support vector machine (SVM) were used to classify the gene and non-gene patterns. The results show that the least squares support vector machine has an error rate of 14.8%, which is 1.3% lower than that of the Fisher classifier.
  • Keywords
    bioinformatics; genetics; least squares approximations; molecular biophysics; support vector machines; E. coli K12; Fisher linear classifier; GC ratio; GenBank; Parzen window method; bioinformatics; gene mark files; gene prediction; gene recognition; least squares SVM; probability density; prokaryote gene data classifier design; support vector machine; whole genome sequence; Bioinformatics; Gene expression; Genomics; Hidden Markov models; Kernel; Least squares methods; Neural networks; Sequences; Support vector machine classification; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-2901-1
  • Electronic_ISBN
    978-1-4244-2902-8
  • Type

    conf

  • DOI
    10.1109/ICBBE.2009.5163250
  • Filename
    5163250
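
The abstract names two features per sample (GC ratio and length) and a Parzen window estimate of their probability densities. The following is a minimal sketch of how those quantities could be computed, not the authors' implementation. The function names are hypothetical, and the Gaussian kernel and the bandwidth `h` are assumptions, since the abstract does not state which window function or width was used.

```python
import math


def gc_ratio(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def features(seq):
    """Return the (GC ratio, length) feature pair for one sample."""
    return gc_ratio(seq), len(seq)


def parzen_density(x, samples, h):
    """Parzen window density estimate at x.

    Assumes a Gaussian window of bandwidth h; the window used in the
    paper is not given in the abstract.
    """
    if not samples:
        raise ValueError("no samples")
    if h <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)


# Toy sequences for illustration only; they are not data from the paper.
toy_samples = ["ATGGCGCTAA", "ATATTTAGC"]
toy_gc = [gc_ratio(s) for s in toy_samples]
density_at_half = parzen_density(0.5, toy_gc, h=0.05)
```

Evaluating `parzen_density` over a grid of GC ratios, once for the gene samples and once for the non-coding samples, would give the two density curves that a classifier such as the Fisher discriminant or an SVM then has to separate.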