Title :
Chinese text categorization study based on CBM learning
Author :
Zhan, Yan ; Chen, Hao
Author_Institution :
Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
Abstract :
Text Categorization (TC) is an important component of many information organization and information management tasks. In many TC applications, the case base grows rapidly, which makes case retrieval inefficient. Case-Base Maintenance (CBM) learning via the Generalization Capability (GC) algorithm reduces the number of cases supplied to the K-Nearest Neighbor (K-NN) algorithm and thereby improves the efficiency of the nearest-neighbor search. Numerical experiments demonstrate the validity of this learning algorithm. Because the K-NN algorithm is used extensively in a variety of areas, the approach can further improve classification performance in TC.
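The abstract does not spell out the GC algorithm, so the following is only a minimal sketch of the general idea of shrinking a case base before K-NN retrieval. It substitutes Hart's condensed nearest neighbor rule, a different and much older reduction technique, for the authors' method. The function names `knn_predict` and `condense_case_base`, the toy 2-D vectors, and the labels are all hypothetical; real Chinese text would first need segmentation and feature extraction.

```python
from collections import Counter
import math


def knn_predict(cases, query, k=1):
    """Classify `query` by majority vote among the k nearest stored cases."""
    nearest = sorted(cases, key=lambda c: math.dist(c[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def condense_case_base(cases):
    """Hart's condensed nearest neighbor rule (a stand-in, NOT the GC algorithm).

    Keeps a case only if the 1-NN rule over the cases kept so far
    misclassifies it, and repeats until a full pass adds nothing.
    """
    kept = [cases[0]]
    changed = True
    while changed:
        changed = False
        for vec, label in cases:
            if knn_predict(kept, vec, k=1) != label:
                kept.append((vec, label))
                changed = True
    return kept


# Hypothetical toy case base: 2-D vectors standing in for document features.
cases = [
    ((0.0, 0.0), "sports"), ((0.1, 0.2), "sports"), ((0.2, 0.1), "sports"),
    ((1.0, 1.0), "finance"), ((0.9, 1.1), "finance"), ((1.1, 0.9), "finance"),
]
reduced = condense_case_base(cases)
print(len(cases), "cases reduced to", len(reduced))
```

Retrieval over `reduced` compares a query against fewer stored cases, which is the efficiency gain the abstract refers to. Whether accuracy is preserved on real data depends on the reduction method and the corpus.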
Keywords :
case-based reasoning; information retrieval; learning (artificial intelligence); pattern clustering; statistical analysis; text analysis; CBM learning; Chinese text categorization; K-NN algorithm; case base maintenance learning; generalization capability algorithm; information management; information retrieval process; k-nearest neighbor; Accuracy; Algorithm design and analysis; Classification algorithms; Databases; Machine learning; Text categorization; Training data; CBM; K-NN; Text Categorization;
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
DOI :
10.1109/FSKD.2010.5569330