• DocumentCode
    2568518
  • Title

    Splice site detection in DNA sequences using a fast classification algorithm

  • Author

    Cervantes, Jair ; Li, XiaoOu ; Yu, Wen

  • Author_Institution
    Dept. of Comput. Sci., CINVESTAV, Mexico City, Mexico
  • fYear
    2009
  • fDate
    11-14 Oct. 2009
  • Firstpage
    2683
  • Lastpage
    2688
  • Abstract
    Support vector machines (SVMs) are known to be excellent algorithms for classification problems. Their principal disadvantage is excessive training time on large data sets, such as DNA sequences. This paper presents a novel SVM classification method that significantly reduces the input data set using a Bayesian technique. Using this system, we are able to classify huge data sets with high accuracy in a reasonable time. The system has been tested successfully on large splice-junction gene sequences (DNA). Experimental results show that the accuracy obtained by the proposed algorithm (98.2%) is comparable with that of other SVM implementations such as SMO (98.4%), LibSVM (98.4%), and Simple SVM (97.6%). Furthermore, the proposed approach scales to large data sets while retaining high classification accuracy.
  • Keywords
    Bayes methods; bioinformatics; support vector machines; Bayesian technique; DNA sequences; SVM; fast classification algorithm; large data sets; splice site detection; splice-junction gene sequences; Bayesian methods; Biological information theory; Classification algorithms; DNA; Proteins; Sequences; Splicing; Support vector machine classification; Bayesian classification; Splice sites detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
  • Conference_Location
    San Antonio, TX
  • ISSN
    1062-922X
  • Print_ISBN
    978-1-4244-2793-2
  • Electronic_ISBN
    1062-922X
  • Type

    conf

  • DOI
    10.1109/ICSMC.2009.5346130
  • Filename
    5346130