• DocumentCode
    3483566
  • Title

    Optimal approach to sequence-to-sequence prediction: applications in bioinformatics

  • Author

    Nguyen, Minh Ngoc ; Rajapakse, Jagath C.

  • Author_Institution
    Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore
  • Volume
    5
  • fYear
    2002
  • fDate
    18-22 Nov. 2002
  • Firstpage
    2254
  • Abstract
    We propose a two-stage approach to the sequence-to-sequence prediction problem that uses support vector machines (SVMs) to optimize the predictions of single-stage techniques. The sequence-to-sequence prediction problem is common to many bioinformatics applications, and we demonstrate our approach on protein secondary structure prediction. The new predictor, which combines different types of GOR (Garnier, Osguthorpe, and Robson) and Bayesian classifiers with SVMs, achieves an accuracy of 70.9% under sevenfold cross-validation on a database of 126 nonhomologous globular proteins. Extending the method to multiple sequence alignments of homologous proteins significantly increases the prediction accuracy to 72.1%. The results show that, in sequence prediction, combined hierarchical classifiers can achieve higher accuracy than single-stage classifiers alone.
  • Keywords
    Bayes methods; biology computing; pattern classification; scientific information systems; support vector machines; Bayesian classifiers; GOR classifiers; SVMs; bioinformatics; nonhomologous globular protein database; optimal sequence-to-sequence prediction; protein secondary structure prediction problem; Accuracy; Application software; Bayesian methods; Bioinformatics; Databases; Neural networks; Proteins; Sequences; Statistics; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
  • Print_ISBN
    981-04-7524-1
  • Type

    conf

  • DOI
    10.1109/ICONIP.2002.1201894
  • Filename
    1201894