Title :
Protein Interaction Hotspot Identification Using Sequence-Based Frequency-Derived Features
Author :
Nguyen, Quang-Thang ; Fablet, Ronan ; Pastor, Dominique
Author_Institution :
Dept. of Signal & Commun., Telecom Bretagne, Brest, France
Abstract :
Finding descriptors that discriminate hotspot residues from other residues remains a challenge in understanding protein interactions. In this paper, descriptors derived from the analysis of amino acid sequences using digital signal processing (DSP) techniques are shown to perform as well as those derived from protein tertiary structure and/or information on the complex. Simulation results show that these descriptors, used on their own with a random forest classifier, predict hotspots with an accuracy of 79% and a precision of 75%. Used jointly with features derived from tertiary structures, they raise performance to an accuracy of 82% and a precision of 80%.
Keywords :
proteins; proteomics; DSP techniques; amino acid sequences; digital signal processing; protein interaction hotspot identification; protein tertiary structure; sequence based frequency derived features; Amino acids; Decision trees; Digital signal processing; Integrated circuits; Proteins; Radio frequency; Vegetation; DSP-based features; electron–ion interaction pseudopotential (EIIP); hotspots; ionization constant (IC); protein interaction; resonant recognition model (RRM); sequence-based features; Amino Acid Sequence; Amino Acids; Computational Biology; Computer Simulation; Models, Molecular; Protein Interaction Mapping; Proteins; Reproducibility of Results; Sequence Analysis, Protein;
Journal_Title :
Biomedical Engineering, IEEE Transactions on
DOI :
10.1109/TBME.2011.2161306
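Illustrative sketch :
The abstract and keywords refer to frequency-derived features obtained by treating an amino acid sequence as a numerical signal (EIIP encoding, resonant recognition model). The record does not specify the authors' actual feature set, so the following is only a minimal sketch of the general idea: encode residues with EIIP values, remove the mean, and summarize the power spectrum. The function names (`eiip_signal`, `power_spectrum`, `spectral_features`) and the two summary features are hypothetical choices for illustration, and the EIIP table is the commonly cited one, which should be checked against a primary source before use.

```python
import cmath

# Assumption: commonly tabulated EIIP values (in Rydbergs) for the 20
# standard amino acids; verify against a primary source before relying on them.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def eiip_signal(sequence):
    """Map a one-letter amino acid sequence to a zero-mean numerical signal."""
    values = [EIIP[aa] for aa in sequence.upper()]
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def power_spectrum(signal):
    """Return |X[k]|^2 for k = 1 .. N//2, computed with a direct DFT."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2 + 1):
        xk = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(xk) ** 2)
    return spectrum


def spectral_features(sequence):
    """Hypothetical summary features: dominant frequency and its energy share."""
    spec = power_spectrum(eiip_signal(sequence))
    total = sum(spec)
    peak_index = max(range(len(spec)), key=spec.__getitem__)
    return {
        # Normalized frequency in cycles per residue (0.5 is the maximum).
        "peak_frequency": (peak_index + 1) / len(sequence),
        # Fraction of the non-DC spectral energy carried by the peak.
        "peak_fraction": spec[peak_index] / total if total > 0 else 0.0,
    }
```

Features of this kind could then be passed, per residue window or per sequence, to any standard random forest implementation; how the paper windows the sequence and which spectral quantities it retains are not stated in this record.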