• DocumentCode
    3321809
  • Title

    Direct Discriminative Pattern Mining for Effective Classification

  • Author

    Cheng, Hong ; Yan, Xifeng ; Han, Jiawei ; Yu, Philip S.

  • Author_Institution
    Univ. of Illinois at Urbana-Champaign, Urbana, IL
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    169
  • Lastpage
    178
  • Abstract
    The application of frequent patterns in classification has demonstrated its power in recent studies. Such classification often adopts a two-step approach: frequent pattern (or classification rule) mining followed by feature selection (or rule ranking). However, this two-step process could be computationally expensive, especially when the problem scale is large or the minimum support is low. It was observed that frequent pattern mining usually produces a huge number of "patterns" that could not only slow down the mining process but also make feature selection hard to complete. In this paper, we propose a direct discriminative pattern mining approach, DDPMine, to tackle the efficiency issue arising from the two-step approach. DDPMine performs a branch-and-bound search for directly mining discriminative patterns without generating the complete pattern set. Instead of selecting the best patterns in a batch, we introduce a "feature-centered" mining approach that generates discriminative patterns sequentially on a progressively shrinking FP-tree by incrementally eliminating training instances. The instance elimination effectively reduces the problem size iteratively and expedites the mining process. Empirical results show that DDPMine achieves orders-of-magnitude speedup without any loss of classification accuracy. It outperforms the state-of-the-art associative classification methods in terms of both accuracy and efficiency.
  • Keywords
    data mining; pattern classification; tree searching; DDPMine; associative classification methods; branch-and-bound search; direct discriminative pattern mining; feature-centered mining approach; instance elimination; progressively shrinking FP-tree; Association rules; Computational efficiency; Explosives; Itemsets; Power generation; Support vector machine classification; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-1836-7
  • Electronic_ISBN
    978-1-4244-1837-4
  • Type

    conf

  • DOI
    10.1109/ICDE.2008.4497425
  • Filename
    4497425