DocumentCode
3519444
Title
Functional Neighbors: Inferring Relationships between Non-Homologous Protein Families Using Family-Specific Packing Motifs
Author
Bandyopadhyay, D. ; Jun Huan ; Jinze Liu ; Prins, J. ; Snoeyink, J. ; Wei Wang ; Tropsha, A.
Author_Institution
GlaxoSmithKline, Collegeville, PA
fYear
2008
fDate
3-5 Nov. 2008
Firstpage
199
Lastpage
206
Abstract
We describe a new approach for inferring functional relationships between non-homologous protein families by examining the statistical enrichment of alternative function predictions in classification hierarchies such as Gene Ontology (GO) and Structural Classification of Proteins (SCOP). Protein structures are represented by robust graphs, and the Fast Frequent Subgraph Mining algorithm is applied to protein families to generate sets of family-specific packing motifs, i.e., amino acid residue packing patterns shared by most family members but infrequent in other proteins. The function of a protein is inferred by identifying in it motifs characteristic of a known family. We employ these family-specific motifs to elucidate functional relationships between families in the GO and SCOP hierarchies. Specifically, we postulate that two families are functionally related if one family is statistically enriched in motifs characteristic of another family, i.e., if the number of proteins in a family containing a motif from another family is greater than expected by chance. This function inference method can help annotate proteins of unknown function, establish functional neighbors of existing families, and suggest alternate functions for known proteins.
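The enrichment criterion in the abstract ("greater than expected by chance") can be illustrated with a small sketch. The abstract does not say which statistical test the authors use; the one-sided hypergeometric test below is an assumption chosen because it is a common way to score this kind of over-representation, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(family_hits, family_size, background_hits, background_size):
    """Upper-tail hypergeometric probability P(X >= family_hits).

    Assumed model (not taken from the paper): the family is treated as
    family_size proteins drawn without replacement from a background of
    background_size proteins, of which background_hits contain at least
    one motif characteristic of the other family.
    """
    total = comb(background_size, family_size)
    upper = min(family_size, background_hits)
    # Sum the probability mass for every outcome at least as extreme
    # as the observed number of motif-containing proteins.
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, family_size - k)
        for k in range(family_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 12 of 40 proteins in family B contain a motif
    # from family A, versus 60 of 2000 proteins in the background set.
    p = enrichment_p_value(12, 40, 60, 2000)
    print(f"p-value: {p:.3g}")
```

A small p-value under this model would mark the second family as a candidate functional neighbor of the first. In practice a significance threshold and a correction for testing many family pairs would also be needed; neither is specified in the abstract.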
Keywords
biology computing; data mining; genomics; ontologies (artificial intelligence); pattern classification; proteins; amino acid residue packing patterns; family-specific packing motifs; fast frequent subgraph mining algorithm; functional neighbors; gene ontology; genomics; inferring relationships; nonhomologous protein families; protein structural classification; Bioinformatics; Computer science; Educational institutions; Genomics; Lifting equipment; Pattern matching; Postal services; Protein engineering; Robustness; Solid modeling; enrichment; family-specific; frequent subgraph mining; functional neighbors; packing motifs; protein function inference; protein function prediction; remote homology; remote similarity; structural motifs;
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
978-0-7695-3452-7
Type
conf
DOI
10.1109/BIBM.2008.84
Filename
4684893