• DocumentCode
    13689
  • Title

    Subgraph Matching with Set Similarity in a Large Graph Database

  • Author

    Liang Hong; Lei Zou; Xiang Lian; Philip S. Yu

  • Author_Institution
    Sch. of Inf. Manage., Wuhan Univ., Wuhan, China
  • Volume
    27
  • Issue
    9
  • fYear
    2015
  • fDate
    Sept. 1 2015
  • Firstpage
    2507
  • Lastpage
    2521
  • Abstract
    In real-world graphs such as social networks, the Semantic Web, and biological networks, each vertex usually contains rich information, which can be modeled by a set of tokens or elements. In this paper, we study a subgraph matching with set similarity (SMS2) query over a large graph database, which retrieves subgraphs that are structurally isomorphic to the query graph and that also satisfy the condition of vertex pair matching with the (dynamic) weighted set similarity. To efficiently process the SMS2 query, this paper designs a novel lattice-based index for the data graph, and lightweight signatures for both query vertices and data vertices. Based on the index and signatures, we propose an efficient two-phase pruning strategy including set similarity pruning and structure-based pruning, which exploits the unique features of both (dynamic) weighted set similarity and graph topology. We also propose an efficient dominating-set-based subgraph matching algorithm guided by a dominating set selection algorithm to achieve better query performance. Extensive experiments on both real and synthetic datasets demonstrate that our method outperforms state-of-the-art methods by an order of magnitude.
  • Keywords
    graph theory; pattern matching; query processing; set theory; SMS2 query; data graph; dominating-set-based subgraph matching algorithm; dynamic set similarity; graph topology; large graph database query; lattice-based index; lightweight signatures; query vertices; real-world graphs; structurally isomorphic subgraphs; structure-based pruning; subgraph matching with set similarity query; synthetic datasets; two-phase pruning strategy; vertex pair matching; weighted set similarity; Heuristic algorithms; Indexes; Lattices; Pattern matching; Proteins; Upper bound; Subgraph matching; graph database; graph matching; index; set similarity
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2015.2391125
  • Filename
    7006728