• DocumentCode
    1850266
  • Title

    Prediction of Protein-Protein Interaction Using Distance Frequency of Amino Acids Grouped with their Physicochemical Properties

  • Author

    Zhang, Shao-Wu ; Cheng, Yong-mei ; Luo, Li ; Pan, Quan

  • Author_Institution
    Coll. of Autom., Northwestern Polytechnical Univ., Xi'an, China
  • fYear
    2011
  • fDate
    27-29 Sept. 2011
  • Firstpage
    70
  • Lastpage
    74
  • Abstract
    Protein-protein interactions (PPIs) play a key role in many cellular processes. These interactions form the basis of phenomena such as DNA replication and transcription, metabolic pathways, signaling pathways, and cell cycle control. Knowing how proteins interact with each other can help biologists understand the molecular mechanisms of the cell. Unfortunately, experimental methods for identifying PPIs are both time-consuming and expensive. Therefore, developing computational approaches for predicting PPIs would be of significant value. Here, we propose a novel method for predicting PPIs using the distance frequency of amino acids grouped by their physicochemical properties (hydrophobicity, normalized van der Waals volume, polarity and polarizability) together with principal component analysis (PCA). First, the 20 standard amino acids were divided into three groups according to the values of each of the four physicochemical properties. Second, the distance frequency feature extraction method was introduced to represent the protein pairs, and the feature vectors extracted with the four physicochemical properties were fused to form different feature vector sets. Third, PCA was used to reduce the vector dimension, and a support vector machine was adopted as the classifier. The overall success rates of our method for hydrophobicity, normalized van der Waals volume, polarity and polarizability are 89.88%, 89.72%, 89.28% and 89.24%, respectively, in the 10-fold cross-validation (10CV) test, which are 6.65%, 8.05%, 9.72% and 8.09% higher than those of Guo's auto-covariance function feature extraction method. The total prediction accuracy obtained by fusing the four physicochemical properties reaches 91.79%. The results show that the current approach is very promising for predicting PPIs and may become a useful tool in the relevant areas.
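    The abstract does not define the distance frequency feature, so the following is only a minimal sketch of one plausible reading of the first two steps: residues are mapped to three groups, and for each group the gaps between consecutive residues of that group are counted into a normalized histogram. The group memberships (a commonly used three-class hydrophobicity split), the `max_dist` cutoff, and the names `distance_frequency` and `pair_features` are assumptions for illustration, not taken from the paper. The PCA and support vector machine steps are not shown.

```python
from collections import Counter

# Assumed three-group hydrophobicity partition (not stated in the abstract).
GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}
# Map each amino acid letter to its group index: polar=0, neutral=1, hydrophobic=2.
AA_TO_GROUP = {
    aa: idx for idx, members in enumerate(GROUPS.values()) for aa in members
}


def distance_frequency(seq, n_groups=3, max_dist=5):
    """Per-group normalized histogram of gaps between consecutive same-group residues.

    Gaps larger than max_dist are counted in the last bin (hypothetical choice).
    Returns a flat list of n_groups * max_dist values.
    """
    # Residues outside the 20 standard letters are skipped.
    labels = [AA_TO_GROUP[aa] for aa in seq if aa in AA_TO_GROUP]
    features = []
    for group in range(n_groups):
        positions = [i for i, lab in enumerate(labels) if lab == group]
        gaps = Counter(
            min(b - a, max_dist) for a, b in zip(positions, positions[1:])
        )
        total = sum(gaps.values())
        features.extend(
            gaps[d] / total if total else 0.0 for d in range(1, max_dist + 1)
        )
    return features


def pair_features(seq_a, seq_b):
    """Represent a protein pair by concatenating the two per-protein vectors."""
    return distance_frequency(seq_a) + distance_frequency(seq_b)
```

    Under these assumptions, each protein yields a 15-value vector per property, and a pair yields 30 values. Vectors built from the other three properties could be concatenated in the same way before dimension reduction and classification.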
  • Keywords
    biology computing; principal component analysis; proteins; support vector machines; PCA; PPI; amino acids; distance frequency feature extraction method; physicochemical properties; protein-protein interaction; Accuracy; Bioinformatics; Feature extraction; distance frequency
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bio-Inspired Computing: Theories and Applications (BIC-TA), 2011 Sixth International Conference on
  • Conference_Location
    Penang
  • Print_ISBN
    978-1-4577-1092-6
  • Type

    conf

  • DOI
    10.1109/BIC-TA.2011.53
  • Filename
    6046875