• DocumentCode
    1798314
  • Title
    DL-PRO: A novel deep learning method for protein model quality assessment
  • Author
    Nguyen, Son P. ; Yi Shang ; Dong Xu
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Missouri, Columbia, MO, USA
  • fYear
    2014
  • fDate
    6-11 July 2014
  • Firstpage
    2071
  • Lastpage
    2078
  • Abstract
    Computational protein structure prediction is important for many applications in bioinformatics. In the process of predicting protein structures, it is essential to accurately assess the quality of the generated models. Although many single-model quality assessment (QA) methods have been developed, their accuracy is not high enough for most real applications. In this paper, a new approach based on the Cα-atom distance matrix and machine learning methods is proposed for single-model QA and the identification of native-like models. Unlike existing energy/scoring functions and consensus approaches, this new approach is purely geometry based. Furthermore, a novel algorithm based on deep learning techniques, called DL-Pro, is proposed. For a protein model, DL-Pro uses the model's distance matrix, which contains the pairwise distances between the Cα atoms of its residues and is sometimes also called a contact map, as an orientation-independent representation. From training examples of distance matrices corresponding to good and bad models, DL-Pro learns a stacked autoencoder network as a classifier. In experiments on selected targets from the Critical Assessment of Structure Prediction (CASP) competition, DL-Pro obtained promising results, outperforming state-of-the-art energy/scoring functions, including OPUS-CA, DOPE, DFIRE, and RW.
  • Keywords
    bioinformatics; learning (artificial intelligence); pattern classification; proteins; Cα atoms distance matrix; CASP competition; Critical Assessment of Structure Prediction competition; DFIRE; DL-PRO; DOPE; OPUS-CA; RW; bioinformatics; classifier; computational protein structure prediction; consensus approach; contact map; deep learning method; energy-scoring function; machine learning methods; native-like model identification; orientation-independent representation; pairwise distance; protein model quality assessment; single-model QA method; single-model quality assessment method; stacked autoencoder network; Classification algorithms; Computational modeling; Predictive models; Proteins; Solid modeling; Three-dimensional displays; Training; Critical Assessment of Structure Prediction (CASP); classification; deep learning; energy and scoring function; protein model quality assessment; stacked autoencoder
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks (IJCNN), 2014 International Joint Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4799-6627-1
  • Type
    conf
  • DOI
    10.1109/IJCNN.2014.6889891
  • Filename
    6889891
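
The abstract describes the Cα distance matrix as an orientation-independent representation of a protein model. The following is a minimal sketch of that idea only, not the authors' implementation: the function name `ca_distance_matrix`, the random toy coordinates, and the chosen rotation and translation are all assumptions made for illustration. It builds the pairwise distance matrix with NumPy and checks that the matrix is unchanged when the whole model is rotated and translated.

```python
import numpy as np


def ca_distance_matrix(coords):
    """Return the N x N matrix of Euclidean distances between N points.

    `coords` is an (N, 3) array of C-alpha coordinates (hypothetical input;
    the record does not say how the authors parse model files).
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcasting yields an (N, N, 3) array of coordinate differences.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy coordinates standing in for the C-alpha atoms of a 5-residue model.
rng = np.random.default_rng(0)
ca = rng.normal(size=(5, 3)) * 3.8

D = ca_distance_matrix(ca)

# Rigid motion: rotate 40 degrees about the z axis, then translate.
theta = np.deg2rad(40.0)
R = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
moved = ca @ R.T + np.array([1.0, -2.0, 0.5])

D_moved = ca_distance_matrix(moved)

# The distance matrix depends only on internal geometry, so it is preserved.
print(np.allclose(D, D_moved))
```

In a pipeline of the kind the abstract outlines, matrices like `D` would serve as classifier inputs; how the authors handle proteins of different lengths is not stated in this record, so that step is left out here.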