DocumentCode :
2425646
Title :
Text Categorization Method Based on Improved Mutual Information and Characteristic Weights Evaluation Algorithms
Author :
Pei, Zhili ; Shi, Xiaohu ; Marchese, Maurizio ; Liang, Yanchun
Author_Institution :
Jilin Univ., Changchun
Volume :
4
fYear :
2007
fDate :
24-27 Aug. 2007
Firstpage :
87
Lastpage :
91
Abstract :
Statistical text categorization can be improved along two main directions: feature selection and the evaluation of characteristic weights. In this paper, we propose an enhanced text categorization method that improves both aspects, based on a modified mutual information algorithm and an evaluation algorithm for characteristic weights. The proposed method is applied to the benchmark test set Reuters-21578 Top10 to examine its effectiveness. Numerical results show that the precision, recall, and F1 value of the proposed method are all superior to those of existing conventional methods.
Keywords :
statistical analysis; text analysis; benchmark test set Reuters-21578 Top10; characteristic weights evaluation algorithms; feature selection; mutual information algorithm; statistical methods; text categorization method; Communications technology; Computer science; Educational institutions; Frequency estimation; Frequency shift keying; Mutual information; Performance evaluation; Statistical analysis; Testing; Text categorization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
Type :
conf
DOI :
10.1109/FSKD.2007.559
Filename :
4406359