DocumentCode
2024321
Title
The design and implementation of an excellent text categorization system
Author
Lu, Mingyu ; Diao, LiLi ; Lu, Yuchang ; Zhou, Lizhu
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Volume
1
fYear
2002
fDate
2002
Firstpage
459
Abstract
Based on a study of text classification techniques, we propose a new text categorization method that uses a weight adjustment measure to improve the vector space model and the naive Bayesian classifier, and we implement an experimental text classification system, CWZ, to compare various text classification approaches. Compared with many commercial text classification systems, CWZ performs much better. We introduce its framework, functions, main modules, and running environment, present our experimental results, and discuss several important technical issues involved in the system, from which we draw some conclusions. We also describe how to improve the vector space model and the naive Bayesian classifier.
Keywords
Bayes methods; classification; expert systems; learning (artificial intelligence); text analysis; CWZ; naive Bayesian classifier; technical issues; text categorization system; text classification techniques; vector space model; weight adjustment measure; Automation; Bayesian methods; Computer science; Electronic mail; Extraterrestrial measurements; Intelligent control; Space technology; Text categorization; Text mining; Weight measurement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Control and Automation, 2002. Proceedings of the 4th World Congress on
Print_ISBN
0-7803-7268-9
Type
conf
DOI
10.1109/WCICA.2002.1022151
Filename
1022151