Author/Authors :
J.C. Nacher, M. Hayashida, T. Akutsu
Abstract :
Many biological studies have focused on proteins, since proteins are essential for most cell functions. Although each protein is unique, proteins share certain common properties. For example, well-defined regions within a protein can fold independently of the rest of the protein and have their own function. These regions are called protein domains, and they serve as the building blocks of proteins.
In this article, we present a theoretical model for studying protein domain networks, in which each node corresponds to one protein and two proteins are connected if they contain the same domain. The resulting degree distribution, i.e., the number of nodes with a given degree k, does not show only a power law with negative exponent γ=-1; it resembles the superposition of two power-law functions, one with a negative exponent and the other with a positive exponent β=1. We call this distribution pattern “scale-free mixing”. To explain the emergence of this superposition of power laws, we propose a basic model with two main components: (1) mutation and (2) duplication of domains. Specifically, duplication gives rise to complete subgraphs (i.e., cliques) in the network, so that for several values of k a large number of nodes with degree k are produced, which explains the positive power-law branch of the degree distribution.
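The network definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_domain_network` and `degree_distribution` and the toy protein and domain identifiers are hypothetical, chosen only for illustration. The sketch links two proteins whenever they share at least one domain and then counts how many nodes have each degree k.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_domain_network(protein_domains):
    """Return adjacency sets: two proteins are linked if they share a domain."""
    # Group proteins by the domains they contain.
    members = defaultdict(set)
    for protein, domains in protein_domains.items():
        for domain in domains:
            members[domain].add(protein)

    # Every pair of proteins that contains the same domain gets an edge,
    # so each domain contributes a complete subgraph (clique).
    adjacency = {protein: set() for protein in protein_domains}
    for proteins in members.values():
        for a, b in combinations(sorted(proteins), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_distribution(adjacency):
    """Map each degree k to the number of nodes that have that degree."""
    return Counter(len(neighbours) for neighbours in adjacency.values())


# Hypothetical toy data (assumption): four proteins share domain D1.
toy = {
    "P1": {"D1"},
    "P2": {"D1"},
    "P3": {"D1"},
    "P4": {"D1", "D2"},
    "P5": {"D2"},
    "P6": {"D3"},
}

network = build_domain_network(toy)
distribution = degree_distribution(network)
print(sorted(distribution.items()))  # [(0, 1), (1, 1), (3, 3), (4, 1)]
```

In this toy case the four proteins containing D1 form a clique, so three of them end up with the same degree, 3. More generally, n proteins that share a single domain and nothing else form a clique in which all n nodes have degree n-1, which is one simple way to see how large shared-domain groups can place many nodes at a high degree.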
To compare our model with experimental data, we generate protein domain networks using protein sequences from the UniProt Knowledgebase (Swiss-Prot) database and domain annotations from the InterPro, Pfam, and SMART databases. Our results indicate that the scale-free mixing pattern is also observed in the experimental data and that it is conserved among organisms such as Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster, Mus musculus, and Homo sapiens.