• DocumentCode
    1041150
  • Title

    An efficient algorithm for discovering frequent subgraphs

  • Author

    Kuramochi, Michihiro ; Karypis, George

  • Author_Institution
    Dept. of Comput. Sci., Minnesota Univ., Minneapolis, MN, USA
  • Volume
    16
  • Issue
    9
  • fYear
    2004
  • Firstpage
    1038
  • Lastpage
    1051
  • Abstract
    Over the years, frequent itemset discovery algorithms have been used to find interesting patterns in various application areas. However, as data mining techniques are being increasingly applied to nontraditional domains, existing frequent pattern discovery approaches cannot be used. This is because the transaction framework that is assumed by these algorithms cannot be used to effectively model the data sets in these domains. An alternate way of modeling the objects in these data sets is to represent them using graphs. Within that model, one way of formulating the frequent pattern discovery problem is that of discovering subgraphs that occur frequently over the entire set of graphs. We present a computationally efficient algorithm, called FSG, for finding all frequent subgraphs in large graph data sets. We experimentally evaluate the performance of FSG using a variety of real and synthetic data sets. Our results show that despite the underlying complexity associated with frequent subgraph discovery, FSG is effective in finding all frequently occurring subgraphs in data sets containing more than 200,000 graph transactions and scales linearly with respect to the size of the data set.
  • Keywords
    computational complexity; data mining; graph theory; pattern recognition; very large databases; chemical compound data sets; data mining techniques; frequent itemset discovery algorithms; frequent pattern discovery problem; frequent subgraph discovery; graph data sets; scientific data sets; Chemical compounds; Data mining; Frequency; Itemsets; Labeling; 65; Index Terms- Data mining; chemical compound data sets.; frequent pattern discovery; scientific data sets;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2004.33
  • Filename
    1316833