Network Similarity Decomposition (NSD): A Fast and Scalable Approach to Network Alignment

Author

Kollias, Giorgos ; Mohammadi, Shahin ; Grama, Ananth

Author_Institution

Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA

Volume

24

Issue

12

fYear

2012

Firstpage

2232

Lastpage

2243

Abstract

As graph-structured data sets become commonplace, there is an increasing need for efficient ways of analyzing such data sets. These analyses include conservation, alignment, differentiation, and discrimination, among others. When defined on general graphs, these problems are considerably harder than their well-studied counterparts on sets and sequences. In this paper, we study the problem of global alignment of large sparse graphs. Specifically, we investigate efficient methods for computing approximations to the state-of-the-art IsoRank solution for finding pairwise topological similarity between nodes in two networks (or within the same network). Pairs of nodes with high similarity can be used to seed global alignments. We present a novel approach to this computationally expensive problem based on uncoupling and decomposing the ranking calculations associated with the computation of similarity scores. Uncoupling refers to independent preprocessing of each input graph. Decomposition implies that pairwise similarity scores can be explicitly broken down into contributions from different link patterns traced back to a low-rank approximation of the initial conditions for the computation. These two concepts result in significant improvements in computational cost, interpretability of similarity scores, and the nature of supported queries. We show over two orders of magnitude improvement in performance over IsoRank/Random Walk formulations, and over an order of magnitude improvement over constrained matrix-triple-product formulations, in the context of real data sets.
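The abstract does not state the underlying recurrence, so the following is only a minimal sketch of the uncoupling and decomposition ideas under stated assumptions. It assumes an IsoRank-style iteration X <- alpha * A X B^T + (1 - alpha) * H on column-normalized adjacency matrices A and B, and a rank-1 initial condition H = w z^T. The function names, the normalization, and the default parameters are hypothetical and are not taken from the paper. Under these assumptions the n-step iterate expands into a sum of outer products of the vectors A^k w and B^k z, each of which is computed from one graph alone.

```python
import numpy as np


def column_normalize(adj):
    """Scale each column to sum to 1 (assumed normalization; zero columns stay zero)."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    return adj / col_sums


def power_sequence(mat, vec, n_iter):
    """Return [vec, mat @ vec, ..., mat^n_iter @ vec]. Uses one graph only (uncoupling)."""
    seq = [np.asarray(vec, dtype=float)]
    for _ in range(n_iter):
        seq.append(mat @ seq[-1])
    return seq


def decomposed_similarity(A, B, w, z, alpha=0.8, n_iter=10):
    """Similarity matrix as a weighted sum of outer products (decomposition)."""
    wk = power_sequence(A, w, n_iter)  # preprocessing of the first graph
    zk = power_sequence(B, z, n_iter)  # preprocessing of the second graph
    X = np.zeros((len(w), len(z)))
    for k in range(n_iter):
        # Contribution of length-k link patterns in both graphs.
        X += (1 - alpha) * alpha**k * np.outer(wk[k], zk[k])
    X += alpha**n_iter * np.outer(wk[n_iter], zk[n_iter])
    return X


def coupled_similarity(A, B, w, z, alpha=0.8, n_iter=10):
    """Reference: the assumed coupled iteration using a matrix triple product."""
    H = np.outer(w, z)
    X = H.copy()
    for _ in range(n_iter):
        X = alpha * (A @ X @ B.T) + (1 - alpha) * H
    return X
```

In this sketch the decomposed form needs only matrix-vector products per graph, whereas the coupled form multiplies full matrices at every step. A higher-rank H could be handled by summing the same expression over its rank-1 components, again as an assumption about how the low-rank approximation would be used.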

Keywords

data analysis; graph theory; matrix algebra; query processing; IsoRank solution; NSD; Random Walk formulations; computational cost; conservation; constrained matrix-triple-product formulations; data set analysis; differentiation; discrimination; graph-structured data sets; network alignment; network similarity decomposition; pairwise similarity scores; pairwise topological similarity; ranking calculation decomposition; ranking calculation uncoupling; similarity score computation; similarity scores interpretability; sparse graph global alignment; supported query nature; Approximation methods; Computational modeling; Context awareness; Mathematical model; Stability analysis; Data mining; singular value decomposition; sparse, structured, and very large systems

fLanguage

English

Journal_Title

Knowledge and Data Engineering, IEEE Transactions on

Publisher

IEEE

ISSN

1041-4347

Type

jour

DOI

10.1109/TKDE.2011.174

Filename

5975146