Title :
Evaluating the Accuracy and Efficiency of Complex Network Classification Algorithms
Author :
Bray, Margaret ; Hertzberg, Vicki
Author_Institution :
Rollins Sch. of Public Health, Emory Univ., Atlanta, GA, USA
Abstract :
Determining the structure, and corresponding classification, of protein-protein interaction (PPI) networks allows scientists to make inferences regarding the functions of proteins and protein complexes. Knowledge of structure can also be used to discover previously unidentified interactions and to direct biological research on these PPI networks. A PPI network is classified by comparing the empirical network to a variety of model network types (i.e., networks that are not directly based on any empirical network). Before attempting to classify any empirical network, it is essential to determine each algorithm's ability to sort known model networks into their respective categories. For this paper, five classification algorithms (degree distribution distance, characteristic curve, relative graphlet frequency, graphlet degree distribution, and cross scoring) were tested on their ability to sort nine different model network types into their respective categories. The model networks were all based on the Saccharomyces cerevisiae PPI network. The results showed all of the algorithms, except cross scoring, to be extremely lacking in both accuracy and efficiency. These methods correctly classified networks less than 60% of the time, suggesting that their ability to classify empirical PPI networks should be strongly questioned. Cross scoring had the highest scores in both accuracy and efficiency, correctly classifying 81% of networks. This method, while drastically outperforming its competitors, still has significant room for improvement.
Keywords :
biology computing; molecular biophysics; network theory (graphs); pattern classification; proteins; PPI network; Saccharomyces cerevisiae PPI network; characteristic curve classification; complex network classification algorithm; cross scoring classification; degree distribution distance classification; empirical network; graphlet degree distribution classification; model networks type; protein complex; protein-protein interaction network; relative graphlet frequency classification; Analytical models; Biological system modeling; Classification algorithms; Complex networks; Mathematical model; Orbits; Proteins; PPI networks; complex networks; graphlets; network classification;
Conference_Title :
Signal-Image Technology and Internet-Based Systems (SITIS), 2014 Tenth International Conference on
DOI :
10.1109/SITIS.2014.75