• DocumentCode
    479507
  • Title
    Boosting Methods for Protein Fold Recognition: An Empirical Comparison
  • Author
    Krishnaraj, Yazhene ; Reddy, Chandan K.
  • Author_Institution
    Dept. of Comput. Sci., Wayne State Univ., Detroit, MI
  • fYear
    2008
  • fDate
    3-5 Nov. 2008
  • Firstpage
    393
  • Lastpage
    396
  • Abstract
    Protein fold recognition is the prediction of a protein's tertiary structure (fold) from its sequence without relying on sequence similarity. Most state-of-the-art research applying machine learning to protein fold recognition has focused on traditional algorithms such as support vector machines (SVM), k-nearest neighbor (KNN) and neural networks (NN). In this paper, we present an empirical study of two boosting algorithms, AdaBoost and LogitBoost, for the problem of fold recognition. Prediction accuracy is measured on a dataset of proteins from the 27 most populated folds in the SCOP database and is compared with results reported in the literature for SVM, KNN and NN algorithms on the same dataset. Overall, boosting methods achieve 60% fold recognition accuracy on an independent test protein dataset, which is the highest accuracy when compared with the values obtained by other methods proposed in the literature. Boosting algorithms have the potential to build efficient classification models very quickly.
  • Keywords
    bioinformatics; learning (artificial intelligence); molecular configurations; neural nets; pattern classification; proteins; proteomics; support vector machines; AdaBoost; KNN; LogitBoost; SCOP database; SVM; boosting method; classification model; k-nearest neighbor; machine learning techniques; neural networks; protein fold recognition; protein sequence; protein tertiary structure prediction; support vector machines; Boosting; Machine learning; Machine learning algorithms; Neural networks; Nuclear magnetic resonance; Protein sequence; Support vector machine classification; Support vector machines; Testing; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    978-0-7695-3452-7
  • Type
    conf
  • DOI
    10.1109/BIBM.2008.83
  • Filename
    4684926