• DocumentCode
    2337388
  • Title

    Text categorization based on frequent patterns with term frequency

  • Author

    Chen, Xiao-Yun ; Chen, Yi ; Wang, Lei ; Yun-Fa, W.

  • Volume
    3
  • fYear
    2004
  • fDate
    26-29 Aug. 2004
  • Firstpage
    1610
  • Abstract
    Association categorization based on frequent patterns is a recently presented technique that builds classification rules from the frequent patterns found in each category and uses these rules to classify new text. However, current association classification methods have two shortcomings when applied to text data: first, they ignore information about a word's frequency in a text; second, they need to prune rules when a large number of rules is generated, which lowers classification accuracy. This paper therefore presents a text categorization algorithm based on frequent patterns with term frequency, which achieves higher performance than other association categorization methods and some current text classification methods. Our study provides evidence that association rule mining can be used to construct fast and effective classifiers for automatic text categorization.
  • Keywords
    data mining; pattern classification; text analysis; trees (mathematics); association categorization technology; association rule mining; frequent patterns; mass rules; pruning rules; text categorization algorithm; text data classification; Association rules; Classification algorithms; Data mining; Frequency; Information technology; Least squares methods; Machine learning; Mathematics; Statistics; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
  • Print_ISBN
    0-7803-8403-2
  • Type

    conf

  • DOI
    10.1109/ICMLC.2004.1382032
  • Filename
    1382032