Title :
Building a Simple and Effective Text Categorization System using Relative Importance in Category
Author :
Yan, Bingheng ; Qian, Depei
Author_Institution :
Xi'an Jiaotong Univ., Xi'an
Abstract :
With the rapid development of the World Wide Web, text categorization has become a key technology for organizing and processing large volumes of document data. A variety of text categorization methods exist, such as k-nearest neighbor (kNN) and support vector machines (SVM). However, these methods are either too complicated or not effective enough. In this paper, we present a new method called relative importance in category (RIIC), which is simpler than most methods and has a lower time complexity. To verify the performance of RIIC, we build a text categorization system (TCS) based on RIIC and compare it with TCSs based on kNN and SVM. Experimental results show that in most cases RIIC performs better than kNN and SVM.
Keywords :
classification; text analysis; document data; relative importance in category; text categorization system; time complexity; Costs; Data engineering; Filters; Machine learning; Nearest neighbor searches; Organizing; Support vector machine classification; Support vector machines; Text categorization; Web sites;
Conference_Title :
Natural Computation, 2007. ICNC 2007. Third International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2875-5
DOI :
10.1109/ICNC.2007.289