• DocumentCode
    684674
  • Title

    Prediction of protein thermostability based on ReliefF-SVM algorithm

  • Author

    Xingyu Xu ; Yanrui Ding

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Jiangnan Univ., Wuxi, China
  • fYear
    2012
  • fDate
    7-9 Dec. 2012
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    One of the major tasks in protein engineering is to understand the factors that stabilize thermophilic proteins and to discriminate thermophilic proteins from mesophilic ones. In this study, sequence and structure features of proteins were calculated. The ReliefF algorithm was then used to find the features most relevant to protein thermostability, and a model was constructed using a support vector machine (SVM) to predict protein thermostability. In a 10-fold cross-validation test, the accuracies for thermophilic and mesophilic proteins were 84.7% and 87.6%, respectively, based on the selected sequence features. Using the selected structure features, 76.1% of thermophilic proteins and 80.3% of mesophilic proteins were correctly predicted. The performance of SVM, the K-nearest neighbors algorithm, and a decision tree in predicting protein thermostability was also compared; the SVM method performed best. The results also indicated that the contents of Gln, Lys, and Glu were the most important sequence attributes. At the structural level, the polar surface area, non-polar surface area, and hydrophobicity were the key factors in distinguishing mesophilic proteins from thermophilic proteins.
  • Keywords
    biology computing; decision trees; feature selection; hydrophobicity; learning (artificial intelligence); molecular biophysics; pattern classification; proteins; support vector machines; 10-fold cross-validation test; Gln; Glu; K-nearest neighbors algorithm; Lys; ReliefF-SVM algorithm; decision tree; mesophilic proteins; polar surface area; protein engineering; protein structure features; protein thermostability prediction; proteins sequence; sequence attributes; sequence features selection; thermophilic proteins; SVM; protein thermostability; sequence feature; structure feature
  • fLanguage
    English
  • Publisher
    IET
  • Conference_Titel
    Information Science and Control Engineering 2012 (ICISCE 2012), IET International Conference on
  • Conference_Location
    Shenzhen
  • Electronic_ISBN
    978-1-84919-641-3
  • Type

    conf

  • DOI
    10.1049/cp.2012.2259
  • Filename
    6755638
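
The abstract describes computing sequence features and ranking them with ReliefF before classification. The following is a minimal sketch of those two steps only, not the authors' implementation: the function names `composition` and `relieff_weights`, the neighbour count `k`, and the range-normalised Manhattan distance are all assumptions made for illustration. It treats the two-class case (thermophilic vs. mesophilic) and visits every instance instead of sampling.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(seq):
    """Return the fraction of each standard amino acid in a sequence.

    Hypothetical helper: one simple kind of sequence feature. Characters
    outside the 20 standard residues are ignored.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def relieff_weights(X, y, k=3):
    """ReliefF-style feature weights for a two-class problem (illustrative).

    X is a list of equal-length numeric feature vectors and y the class
    labels. For every instance, the k nearest same-class neighbours (hits)
    and the k nearest other-class neighbours (misses) are found. A feature
    gains weight when it differs across misses and loses weight when it
    differs across hits.
    """
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        # Rank all other instances by distance to instance i.
        ranked = sorted(range(n), key=lambda m: dist(X[i], X[m]))
        ranked = [m for m in ranked if m != i]
        hits = [m for m in ranked if y[m] == y[i]][:k]
        misses = [m for m in ranked if y[m] != y[i]][:k]
        for j in range(d):
            if hits:
                w[j] -= sum(diff(j, X[i], X[h]) for h in hits) / (len(hits) * n)
            if misses:
                w[j] += sum(diff(j, X[i], X[m]) for m in misses) / (len(misses) * n)
    return w
```

In a pipeline like the one the abstract outlines, the highest-weighted features would then be passed to a classifier such as an SVM and evaluated with 10-fold cross-validation; that step is omitted here because it depends on a third-party library.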