• DocumentCode
    1317180
  • Title

    Novel Nonlinear Knowledge-Based Mean Force Potentials Based on Machine Learning

  • Author

    Dong, Qiwen ; Zhou, Shuigeng

  • Author_Institution
    Shanghai Key Lab. of Intell. Inf. Process., Fudan Univ., Shanghai, China
  • Volume
    8
  • Issue
    2
  • fYear
    2011
  • Firstpage
    476
  • Lastpage
    486
  • Abstract
    The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions is critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structure. Almost all current knowledge-based potentials take the form of a weighted linear sum of interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained by nonlinear classifiers, instead of relative frequencies of interaction pairs against a reference state or linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five knowledge-based mean force potentials, each either Boltzmann-based or linear, are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys “R” Us data set. The evaluation metrics include the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures. Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and that the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.
  • Keywords
    Boltzmann machines; ab initio calculations; biology computing; knowledge engineering; learning (artificial intelligence); molecular biophysics; molecular configurations; proteins; support vector machines; DFIRE-SCM potential; DIH potential; ab initio protein structure; amino acid sequences; coarse-grained models; computational biology; knowledge-based mean force potentials; machine learning; mean force Boltzmann potential; molecular biology; nonlinear classifiers; proteins; statistical analysis; support vector machine; two-body atom-level Boltzmann-based potential; two-body atom-level linear potential; two-body residue-level linear potential; Computational biology; Force; IEEE Potentials; Knowledge based systems; Proteins; Support vector machines; Training; Mean force potential; nonlinear potential; protein docking; protein structure prediction; Artificial Intelligence; Computational Biology; Knowledge Bases; Nonlinear Dynamics; Protein Conformation; Protein Folding; Proteins;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2010.86
  • Filename
    5567096