Title :
SECTCS: towards improving VSM and Naive Bayesian classifier
Author :
Lu, Mingyu ; Hu, Keyun ; Wu, Yi ; Lu, Yuchang ; Zhou, Lizhu
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Abstract :
Building on a study of text classification techniques, we propose a new text classification method that improves the Vector Space Model (VSM) and the Naive Bayesian classifier by means of a weight adjustment measure. We implement an experimental text classification system, SECTCS (Smart English and Chinese Text Classification System), and use it to compare various text classification approaches. Compared with many commercial text classification systems, SECTCS performs very well. We introduce its framework, functions, and running environment, present our experimental results, and discuss several important technical issues involved in the system, from which we draw some conclusions. We also describe in detail how the Vector Space Model and the Naive Bayesian classifier are improved.
Keywords :
Bayes methods; classification; data mining; information resources; text analysis; Naive Bayesian classifier; SECTCS; Smart English and Chinese Text Classification System; VSM; Vector Space Model; experimental results; text classification technique; text mining; weight adjustment measure; Bayesian methods; Computer science; Machine learning; Mathematics; Performance analysis; Space technology; Text categorization
Conference_Title :
Systems, Man and Cybernetics, 2002 IEEE International Conference on
Print_ISBN :
0-7803-7437-1
DOI :
10.1109/ICSMC.2002.1176403