• DocumentCode
    8907
  • Title
    An Integrated Framework for Functional Annotation of Protein Structural Domains
  • Author
    Lei Deng ; Zhigang Chen
  • Author_Institution
    Sch. of Software, Central South Univ., Changsha, China
  • Volume
    12
  • Issue
    4
  • fYear
    2015
  • fDate
    July-Aug. 1 2015
  • Firstpage
    902
  • Lastpage
    913
  • Abstract
    Structural domains are evolutionary and functional units of proteins and play a critical role in comparative and functional genomics. Computational assignment of domain function with high reliability is essential for understanding whole-protein functions. However, functional annotations are conventionally assigned to full-length proteins, rather than specific functions being associated with individual structural domains. In this article, we present Structural Domain Annotation (SDA), a novel computational approach to predicting functions for SCOP structural domains. The SDA method uses a Bayesian network to integrate heterogeneous information sources, including structure-alignment-based protein-SCOP mapping features, InterPro2GO mapping information, PSSM profiles, and sequence neighborhood features. By annotating SCOP domains with Gene Ontology terms on a large scale using SDA, we obtained a database of SCOP domain to Gene Ontology mappings, which covers 162,000 of the approximately 166,900 domains in SCOPe 2.03 (>97 percent) together with their predicted Gene Ontology functions. We have benchmarked SDA using a single-domain protein dataset and an independent dataset from different species. Comparative studies show that SDA significantly outperforms the existing function prediction methods for structural domains in terms of coverage and maximum F-measure.
  • Keywords
    Bayes methods; biology computing; evolution (biological); genomics; molecular biophysics; molecular configurations; ontologies (artificial intelligence); proteins; Bayesian network; InterPro2GO mapping information; PSSM Profiles; SCOP structural domains; SCOPe 2.03; SDA method; benchmarked SDA; comparative genomics; computational assignment; evolutionary units; full-length proteins; functional annotation; functional genomics; functional units; heterogeneous information sources; individual structural domains; integrated framework; large-scale annotating gene ontology terms; maximum F-measure; protein structural domains; sequence neighborhood features; single-domain protein dataset; structural domain annotation; structure alignment based protein-SCOP mapping; whole-protein functions; Bioinformatics; Databases; Ontologies; Proteins; Support vector machines; PSSM; SCOP domain function; structure alignment
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2015.2389213
  • Filename
    7004798