Title :
Predicting Interacting Residues Using Long-Distance Information and Novel Decoding in Hidden Markov Models
Author :
Kern, Colin ; Gonzalez, A.J. ; Liao, Li ; Vijay-Shanker, K.
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. of Delaware, Newark, DE, USA
Abstract :
Identifying the residues involved in protein-protein and protein-ligand interactions is critical for predicting and understanding those interactions and has practical impact on mutagenesis and drug design. In this work, we introduce a new decoding algorithm, ETB-Viterbi, with an early traceback mechanism, and apply it to interaction profile hidden Markov models (ipHMMs) to enable optimized incorporation of long-distance correlations between interacting residues, leading to improved prediction accuracy. The method was tested on a set of domain-domain interaction families from the 3DID database and showed a statistically significant improvement in accuracy as measured by F-score. To assess the method's effectiveness and robustness in capturing correlation signals, we also used sets of simulated data, based on the 3DID dataset, with controllable correlation between interacting residues, as well as sequences in reversed orientation. Prediction consistently improves as the correlations increase and is not significantly affected by sequence orientation.
Keywords :
bioinformatics; hidden Markov models; molecular biophysics; molecular configurations; optimisation; proteins; sequences; sequential decoding; 3DID database; 3DID dataset; ETB-Viterbi decoding algorithm; F-score; correlation signals; domain-domain interaction families; drug design; interacting residues prediction; long-distance information; mutagenesis; prediction accuracy; protein-ligand interaction; protein-protein interaction; reversed sequence orientation; simulated data base; traceback mechanism; Accuracy; Amino acids; Correlation; Decoding; Viterbi algorithm; Algorithms; Cluster Analysis; Computational Biology; Computer Simulation; Markov Chains; Protein Interaction Mapping; Sequence Analysis, Protein
Journal_Title :
NanoBioscience, IEEE Transactions on
DOI :
10.1109/TNB.2013.2263810
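Note :
The abstract reports accuracy as an F-score over predicted interacting residues. The sketch below shows one common way such a score can be computed from per-residue binary labels (1 = interacting, 0 = non-interacting). The function name `f_score`, the `beta` parameter, and the toy label vectors are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
def f_score(true_labels, predicted_labels, beta=1.0):
    """F-beta score for binary per-residue labels (1 = interacting).

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted interacting
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted interacting, actually not
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed interacting residues

    if tp == 0:
        # No true positives: precision and/or recall is zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy example (made-up labels for an 8-residue sequence).
true_labels = [0, 1, 1, 0, 0, 1, 0, 0]
predicted_labels = [0, 1, 0, 0, 1, 1, 0, 0]
score = f_score(true_labels, predicted_labels)
print(f"F1 = {score:.3f}")
```

With `beta=1.0` this is the harmonic mean of precision and recall, which penalizes a decoder that labels too many residues as interacting as well as one that labels too few.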