• DocumentCode
    1496401
  • Title
    Mining Frequent Subgraph Patterns from Uncertain Graph Data
  • Author
    Zou, Zhaonian; Li, Jianzhong; Gao, Hong; Zhang, Shuo
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
  • Volume
    22
  • Issue
    9
  • fYear
    2010
  • Firstpage
    1203
  • Lastpage
    1218
  • Abstract
    In many real applications, graph data is subject to uncertainties due to incompleteness and imprecision of data. Mining such uncertain graph data is semantically different from and computationally more challenging than mining conventional exact graph data. This paper investigates the problem of mining uncertain graph data and especially focuses on mining frequent subgraph patterns on an uncertain graph database. A novel model of uncertain graphs is presented, and the frequent subgraph pattern mining problem is formalized by introducing a new measure, called expected support. This problem is proved to be NP-hard. An approximate mining algorithm is proposed to find a set of approximately frequent subgraph patterns by allowing an error tolerance on expected supports of discovered subgraph patterns. The algorithm uses efficient methods to determine whether a subgraph pattern can be output or not and a new pruning method to reduce the complexity of examining subgraph patterns. Analytical and experimental results show that the algorithm is very efficient, accurate, and scalable for large uncertain graph databases. To the best of our knowledge, this paper is the first one to investigate the problem of mining frequent subgraph patterns from uncertain graph data.
  • Keywords
    approximation theory; computational complexity; data mining; NP-hard problem; approximate mining algorithm; error tolerance; expected support measurement; frequent subgraph pattern mining problem; pruning method; uncertain graph data; uncertain graph database; graph mining; algorithm; frequent subgraph pattern; uncertain graph
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2010.80
  • Filename
    5467072