DocumentCode :
1289899
Title :
Network Similarity Decomposition (NSD): A Fast and Scalable Approach to Network Alignment
Author :
Kollias, Giorgos ; Mohammadi, Shahin ; Grama, Ananth
Author_Institution :
Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
Volume :
24
Issue :
12
fYear :
2012
Firstpage :
2232
Lastpage :
2243
Abstract :
As graph-structured data sets become commonplace, there is increasing need for efficient ways of analyzing such data sets. These analyses include conservation, alignment, differentiation, and discrimination, among others. When defined on general graphs, these problems are considerably harder than their well-studied counterparts on sets and sequences. In this paper, we study the problem of global alignment of large sparse graphs. Specifically, we investigate efficient methods for computing approximations to the state-of-the-art IsoRank solution for finding pairwise topological similarity between nodes in two networks (or within the same network). Pairs of nodes with high similarity can be used to seed global alignments. We present a novel approach to this computationally expensive problem based on uncoupling and decomposing ranking calculations associated with the computation of similarity scores. Uncoupling refers to independent preprocessing of each input graph. Decomposition implies that pairwise similarity scores can be explicitly broken down into contributions from different link patterns traced back to a low-rank approximation of the initial conditions for the computation. These two concepts result in significant improvements, in terms of computational cost, interpretability of similarity scores, and nature of supported queries. We show over two orders of magnitude improvement in performance over IsoRank/Random Walk formulations, and over an order of magnitude improvement over constrained matrix-triple-product formulations, in the context of real data sets.
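The uncoupling and decomposition ideas in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption: the iteration form X <- alpha * B X A^T + (1 - alpha) * H, the rank-1 initial condition H = w z^T, and all function and parameter names are hypothetical and are not taken from the authors' implementation. Under those assumptions, each term of the similarity matrix is an outer product of two vectors, and each vector depends on only one of the two graphs.

```python
import numpy as np


def normalize_columns(adj):
    """Scale each column to sum to 1; all-zero columns are left as zeros."""
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    return adj / col_sums


def coupled_similarity(A, B, w, z, alpha=0.8, n_iter=20):
    """Assumed IsoRank-style iteration, coupling both graphs at every step."""
    H = np.outer(w, z)  # rank-1 initial condition (an assumption)
    X = H.copy()
    for _ in range(n_iter):
        X = alpha * B @ X @ A.T + (1 - alpha) * H
    return X


def decomposed_similarity(A, B, w, z, alpha=0.8, n_iter=20):
    """Same result, built from per-graph vector sequences.

    wk depends only on graph B and zk only on graph A (uncoupling);
    each outer product is the contribution of one power k (decomposition).
    """
    X = np.zeros((B.shape[0], A.shape[0]))
    wk, zk = w.copy(), z.copy()
    for k in range(n_iter):
        X += (1 - alpha) * alpha**k * np.outer(wk, zk)
        wk = B @ wk  # touches graph B only
        zk = A @ zk  # touches graph A only
    X += alpha**n_iter * np.outer(wk, zk)
    return X


# Two small example graphs: a 3-node star and a 4-node path.
A = normalize_columns(np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float))
B = normalize_columns(
    np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
)
w = np.full(4, 0.25)
z = np.full(3, 1.0 / 3.0)

X_coupled = coupled_similarity(A, B, w, z)
X_decomposed = decomposed_similarity(A, B, w, z)
```

In this sketch the two per-graph vector sequences could be precomputed and stored separately for each network, which is one way to read the abstract's "independent preprocessing of each input graph". A higher-rank H would, under the same assumptions, be handled by summing the result over its rank-1 components.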
Keywords :
data analysis; graph theory; matrix algebra; query processing; IsoRank solution; NSD; Random Walk formulations; computational cost; conservation; constrained matrix-triple-product formulations; data set analysis; differentiation; discrimination; graph-structured data sets; network alignment; network similarity decomposition; pairwise similarity scores; pairwise topological similarity; ranking calculation decomposition; ranking calculation uncoupling; similarity score computation; similarity scores interpretability; sparse graph global alignment; supported query nature; Approximation methods; Computational modeling; Context awareness; Mathematical model; Stability analysis; Data mining; singular value decomposition; sparse, structured, and very large systems
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2011.174
Filename :
5975146