Title :
The design and implementation of an excellent text categorization system
Author :
Lu, Mingyu ; Diao, LiLi ; Lu, Yuchang ; Zhou, Lizhu
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Abstract :
Building on a study of text classification techniques, we propose a new text categorization method that uses a weight adjustment measure to improve both the vector space model and the naive Bayesian classifier. We implement an experimental text classification system, CWZ, to compare various text classification approaches. CWZ performs much better than many commercial text classification systems. We introduce its framework, functions, main modules and running environment, present our experimental results, and discuss several important technical issues involved in the system, from which we draw conclusions. We also describe how the vector space model and naive Bayesian classifier are improved.
Keywords :
Bayes methods; classification; expert systems; learning (artificial intelligence); text analysis; CWZ; naive Bayesian classifier; technical issues; text categorization system; text classification techniques; vector space model; weight adjustment measure; Automation; Bayesian methods; Computer science; Electronic mail; Extraterrestrial measurements; Intelligent control; Space technology; Text categorization; Text mining; Weight measurement;
Conference_Title :
Intelligent Control and Automation, 2002. Proceedings of the 4th World Congress on
Print_ISBN :
0-7803-7268-9
DOI :
10.1109/WCICA.2002.1022151