DocumentCode :
2337388
Title :
Text categorization based on frequent patterns with term frequency
Author :
Chen, Xiao-Yun ; Chen, Yi ; Wang, Lei ; Yun-Fa, W.
Volume :
3
fYear :
2004
fDate :
26-29 Aug. 2004
Firstpage :
1610
Abstract :
Association categorization based on frequent patterns is a recently presented technique that builds classification rules from the frequent patterns in each category and classifies new texts using these rules. However, current association classification methods have two shortcomings when applied to text data: first, they ignore information about a word's frequency within a text; second, they must prune rules when a large number of rules is generated, which lowers classification accuracy. This paper therefore presents a text categorization algorithm based on frequent patterns with term frequency, which obtains higher performance than other association categorization methods and some current text classification methods. Our study provides evidence that association rule mining can be used to construct fast and effective classifiers for automatic text categorization.
Keywords :
data mining; pattern classification; text analysis; trees (mathematics); association categorization technology; association rule mining; frequent patterns; mass rules; pruning rules; text categorization algorithm; text data classification; Association rules; Classification algorithms; Data mining; Frequency; Information technology; Least squares methods; Machine learning; Mathematics; Statistics; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
Type :
conf
DOI :
10.1109/ICMLC.2004.1382032
Filename :
1382032