Regional Information Center for Science and Technology - Predicting protein complexes via the integration of multiple biological information

DocumentCode :

573715

Title :

Predicting protein complexes via the integration of multiple biological information

Author :

Tang, Xiwei ; Wang, Jianxin ; Pan, Yi

Author_Institution :

Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China

fYear :

2012

fDate :

18-20 Aug. 2012

Firstpage :

174

Lastpage :

179

Abstract :

Protein complexes are a cornerstone of many biological processes; together they form various types of molecular machinery that perform a vast array of biological functions. The growing amount of protein-protein interaction (PPI) data has enabled a number of computational methods for predicting protein complexes. Many complex-detection algorithms consider only the PPI data. However, PPI data from high-throughput techniques is flooded with false interactions, and the insufficiency of the PPI data significantly lowers the accuracy of these methods. In the current work, we develop a novel method named CMBI to discover protein complexes via the integration of multiple biological resources, including gene expression profiles, essential protein information and PPI data. First, CMBI defines the functional similarity (FS) of each pair of interacting proteins based on the edge-clustering coefficient (ECC) from the PPI network and the Pearson correlation coefficient (PCC) from the gene expression data. Second, CMBI selects essential proteins as seeds to build the protein complex cores. During the growth process, the seeds' essential protein neighbors and the neighbors whose FS with the seeds exceeds the threshold T are added to the complex cores. After the complex cores are constructed, CMBI generates protein complexes by attaching to the cores their direct neighbors with FS > T. In addition to the essential proteins, CMBI also uses other proteins as seeds to expand protein complexes. To check the performance of CMBI, we compare the complexes discovered by CMBI with those found by other techniques by matching the predicted complexes against the reference complexes. We subsequently use GO::TermFinder to analyze the complexes predicted by the various methods. Finally, the effect of parameter T is investigated. The results from the GO functional enrichment and matching analyses show that CMBI performs significantly better than the state-of-the-art methods. This indicates that integrating multiple sources of biological information is an effective way to identify protein complexes in the PPI network.
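
The core-building and attachment steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how ECC and PCC are combined into FS, so the sum used in `fs` is an assumption, as is the choice to attach a neighbor when its FS with any core member exceeds T. The ECC formula is the commonly used one (triangles through an edge divided by the maximum possible number), which may differ from the paper's exact definition. All function names and the toy data are hypothetical.

```python
from math import sqrt

def ecc(adj, u, v):
    """Edge-clustering coefficient of edge (u, v): the number of common
    neighbors divided by min(deg(u) - 1, deg(v) - 1). Common definition,
    assumed here; returns 0.0 when no triangle is possible."""
    common = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return common / denom if denom > 0 else 0.0

def pcc(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def fs(adj, expr, u, v):
    """Functional similarity of an interacting pair.
    ASSUMPTION: ECC + PCC; the abstract does not give the combination rule."""
    return ecc(adj, u, v) + pcc(expr[u], expr[v])

def build_core(adj, expr, essential, seed, T):
    """Grow a core from a seed: add neighbors that are essential proteins
    or whose FS with the seed exceeds T."""
    core = {seed}
    for n in adj[seed]:
        if n in essential or fs(adj, expr, seed, n) > T:
            core.add(n)
    return core

def attach(adj, expr, core, T):
    """Attach direct neighbors of the core with FS > T.
    ASSUMPTION: FS is measured against the adjacent core member."""
    result = set(core)
    for c in core:
        for n in adj[c] - core:
            if fs(adj, expr, c, n) > T:
                result.add(n)
    return result

# Toy PPI network (adjacency sets) and expression profiles, invented for illustration.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
expr = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [3.0, 2.0, 1.0],
    "D": [1.0, 2.0, 3.0],
    "E": [1.0, 2.0, 3.0],
}
essential = {"A", "B"}

core = build_core(adj, expr, essential, seed="A", T=0.5)
predicted = attach(adj, expr, core, T=0.5)
```

In this toy example, protein C is left out of the core because its expression profile is anti-correlated with the seed, even though it closes a triangle with A and B; this is the kind of interaction that gene expression data is meant to filter.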

Keywords :

biochemistry; biological techniques; correlation methods; genetics; molecular biophysics; proteins; CMBI method; GO functional enrichment analysis; PPI network; Pearson correlation coefficient; biological functions; biological processing; computational methods; cornerstone; edge-clustering coefficient; false interactions; gene expression data; gene expression profiles; high-throughput techniques; matching analysis; molecular machinery; multiple biological information integration; parameter T effect; predicted complexes; predicting protein complexes; protein complex cores; protein-protein interaction data; reference complexes; state-of-the-art methods; Biological information theory; Clustering algorithms; Irrigation; Proteins; USA Councils;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Systems Biology (ISB), 2012 IEEE 6th International Conference on

Conference_Location :

Xi'an

Print_ISBN :

978-1-4673-4396-1

Electronic_ISBN :

978-1-4673-4397-8

Type :

conf

DOI :

10.1109/ISB.2012.6314132

Filename :

6314132

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=573715