Title :
Protein Structure Classification Based on Conserved Hydrophobic Residues
Author :
Chowriappa, Pradeep ; Dua, Sumeet ; Kanno, Jinko ; Thompson, Hilary W.
Author_Institution :
Dept. of Comput. Sci., Louisiana Tech Univ., Ruston, LA, USA
Abstract :
Protein folding is frequently guided by local residue interactions that form clusters in the protein core. The interactions between residue clusters serve as potential nucleation sites in the folding process. Evidence suggests that these residue interactions are governed by the hydrophobic propensities of the residues. An array of hydrophobicity scales has been developed to determine the hydrophobic propensities of residues under different environmental conditions. In this work, we propose a graph-theory-based data mining framework to extract and isolate protein structural features that remain invariant across evolutionarily related proteins, through the integrated analysis of five well-known hydrophobicity scales over the 3D structure of proteins. We hypothesize that homologous proteins contain conserved hydrophobic residues and exhibit analogous residue interaction patterns in the folded state. The results demonstrate that discriminatory residue interaction patterns shared among proteins of the same family can be employed for both the structural and the functional annotation of proteins. On average, we obtained 90 percent accuracy in protein classification with a significantly smaller feature vector than previous results in the area. This work presents an elaborate study, as well as validation evidence, to illustrate the efficacy of the method and the correctness of the results reported.
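As a rough illustration of the kind of structure the abstract describes, the sketch below builds a contact graph over hydrophobic residues from 3D coordinates. It is not the authors' method. It assumes a single scale (Kyte-Doolittle, which may or may not be among the five scales used), a hydropathy threshold of 0, a 7.0 Angstrom distance cutoff, and one representative coordinate per residue. The function name `hydrophobic_contact_graph` and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations

# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
# Assumption: used here as a single example scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_contact_graph(residues, cutoff=7.0, threshold=0.0):
    """Build an undirected contact graph over hydrophobic residues.

    residues:  list of (one_letter_code, (x, y, z)) tuples.
    cutoff:    assumed maximum distance (Angstroms) for an edge.
    threshold: assumed minimum hydropathy for a residue to become a node.
    Returns a dict mapping residue index -> set of neighbouring indices.
    """
    # Nodes: residues whose hydropathy exceeds the threshold.
    nodes = [i for i, (aa, _) in enumerate(residues)
             if KYTE_DOOLITTLE[aa] > threshold]
    graph = {i: set() for i in nodes}
    # Edges: pairs of hydrophobic residues within the distance cutoff.
    for i, j in combinations(nodes, 2):
        if math.dist(residues[i][1], residues[j][1]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Toy input with made-up coordinates, for illustration only.
toy = [
    ("L", (0.0, 0.0, 0.0)),
    ("K", (3.8, 0.0, 0.0)),   # hydrophilic, excluded from the graph
    ("V", (5.0, 3.0, 0.0)),
    ("I", (20.0, 0.0, 0.0)),  # hydrophobic but far from the cluster
    ("F", (4.0, 4.0, 3.0)),
]
graph = hydrophobic_contact_graph(toy)
```

In this toy input, residues 0, 2, and 4 form a fully connected cluster, while residue 3 remains an isolated node. Recurring subgraphs of this kind, compared across proteins of one family, are the sort of pattern a subgraph-mining step could search for.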
Keywords :
biology computing; data mining; graph theory; hydrophobicity; proteins; hydrophobic residues; protein folding; protein structure classification; Bioinformatics; Bioinformatics (genome or protein) databases; Hydrophobicity scales; Protein Databases; Structural classification; subgraph mining; Algorithms; Computational Biology; Computer Simulation; Databases, Genetic; Databases, Protein; Hydrogen Bonding; Imaging, Three-Dimensional; Models, Chemical; Protein Binding; Protein Conformation; Sequence Analysis, Protein; Solvents
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2008.77