• DocumentCode
    1374612
  • Title
    Touring Protein Space with Matt
  • Author
    Daniels, Noah ; Kumar, Anoop ; Cowen, Lenore ; Menke, Matt

  • Author_Institution
    Tufts University, Medford
  • Volume
    9
  • Issue
    1
  • fYear
    2012
  • Firstpage
    286
  • Lastpage
    293
  • Abstract
    Using the Matt structure alignment program, we take a tour of protein space, producing a hierarchical clustering scheme that divides protein structural domains into clusters based on geometric dissimilarity. While it was known that purely structural, geometric, distance-based measures of structural similarity, such as Dali/FSSP, could largely replicate hand-curated schemes such as SCOP at the family level, it was an open question as to whether any such scheme could approximate SCOP at the more distant superfamily and fold levels. We partially answer this question in the affirmative, by designing a clustering scheme based on Matt that approximately matches SCOP at the superfamily level, and demonstrates qualitative differences in performance between Matt and DaliLite. Implications for the debate over the organization of protein fold space are discussed. Based on our clustering of protein space, we introduce the Mattbench benchmark set, a new collection of structural alignments useful for testing sequence aligners on more distantly homologous proteins.
  • Keywords
    Benchmark testing; Bioinformatics; Clustering algorithms; Indexes; Measurement; Proteins; Training; SCOP; automated classification; fold space; hierarchical classification; structure alignment; Cluster Analysis; Computational Biology; Models, Molecular; Protein Conformation; Protein Folding; Sequence Alignment; Sequence Analysis, Protein; Software
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2011.70
  • Filename
    6078456