• Title of article

    The PDB is a Covering Set of Small Protein Structures

  • Author/Authors

    Daisuke Kihara، نويسنده , , Jeffrey Skolnick and Eugene I. Shakhnovich، نويسنده ,

  • Issue Information
    روزنامه با شماره پیاپی سال 2003
  • Pages
    10
  • From page
    793
  • To page
    802
  • Abstract
    Structure comparisons of all representative proteins have been done. Employing the relative root mean square deviation (RMSD) from native enables the assessment of the statistical significance of structure alignments of different lengths in terms of a Z-score. Two conclusions emerge: first, proteins with their native fold can be distinguished by their Z-score. Second and somewhat surprising, all small proteins up to 100 residues in length have significant structure alignments to other proteins in a different secondary structure and fold class; i.e. 24.0% of them have 60% coverage by a template protein with a RMSD below 3.5 Å and 6.0% have 70% coverage. If the restriction that we align proteins only having different secondary structure types is removed, then in a representative benchmark set of proteins of 200 residues or smaller, 93% can be aligned to a single template structure (with average sequence identity of 9.8%), with a RMSD less than 4 Å, and 79% average coverage. In this sense, the current Protein Data Bank (PDB) is almost a covering set of small protein structures. The length of the aligned region (relative to the whole protein length) does not differ among the top hit proteins, indicating that protein structure space is highly dense. For larger proteins, non-related proteins can cover a significant portion of the structure. Moreover, these top hit proteins are aligned to different parts of the target protein, so that almost the entire molecule can be covered when combined. The number of proteins required to cover a target protein is very small, e.g. the top ten hit proteins can give 90% coverage below a RMSD of 3.5 Å for proteins up to 320 residues long. These results give a new view of the nature of protein structure space, and its implications for protein structure prediction are discussed.
  • Keywords
    PDB , protein structure space , relative RMSD , fragments , Protein structure comparison
  • Journal title
    Journal of Molecular Biology
  • Serial Year
    2003
  • Journal title
    Journal of Molecular Biology
  • Record number

    1243214