Title :
Design of Chinese Text Categorization Classifier Based on Attribute Bagging
Author :
Zhang, Xiang ; Zhou, Mingquan ; Dong, Lili ; Ye, Na
Author_Institution :
Coll. of Inf. Sci. & Technol., Northwest Univ., Xi'an, China
Abstract :
To improve the precision and recall of a Chinese text classifier, this paper applies attribute bagging, an improved variant of the bagging algorithm. Documents are represented with the vector space model, and information gain is used for feature selection. Multiple training sets are obtained by re-sampling attributes, and kNN serves as the individual classifier. The final classification result is obtained by voting. Experiments show that attribute bagging achieves lower error and better performance than both bagging and kNN in Chinese text categorization.
Keywords :
support vector machines; text analysis; Chinese text categorization classifier; attribute bagging algorithm; information gain; multiple training set; resampling attributes; support vector machine; vector space model; Algorithm design and analysis; Bagging; Boosting; Control engineering; Educational institutions; Frequency; Information science; Machine learning; Space technology; Text categorization; Chinese text categorization; attribute bagging; information gain; vector space model;
Conference_Title :
Business Intelligence and Financial Engineering, 2009. BIFE '09. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-0-7695-3705-4
DOI :
10.1109/BIFE.2009.55
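
The pipeline summarized in the abstract (re-sample attributes, train one kNN per attribute subset, combine by voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the fixed subset size, sampling attributes without replacement, and the squared Euclidean distance are all assumptions made for illustration, and the information gain feature-selection step is omitted.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training rows, with squared
    Euclidean distance computed only over the given attribute subset."""
    dists = []
    for row, label in zip(train_X, train_y):
        d = sum((row[f] - x[f]) ** 2 for f in features)
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def attribute_bagging_predict(train_X, train_y, x,
                              n_classifiers=11, subset_size=2, k=3, seed=0):
    """Hypothetical attribute-bagging ensemble: each member is a kNN
    classifier restricted to a random attribute subset; the ensemble
    returns the label that receives the most member votes."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    ensemble_votes = Counter()
    for _ in range(n_classifiers):
        # Assumption: attributes are drawn without replacement per member.
        features = rng.sample(range(n_features), subset_size)
        ensemble_votes[knn_predict(train_X, train_y, x, features, k)] += 1
    return ensemble_votes.most_common(1)[0][0]


# Toy term-weight vectors (made-up data, four attributes, two classes).
TRAIN_X = [
    [1, 1, 1, 1], [1, 2, 1, 2], [2, 1, 2, 1],
    [8, 9, 8, 9], [9, 8, 9, 8], [9, 9, 9, 9],
]
TRAIN_Y = ["A", "A", "A", "B", "B", "B"]
```

Calling `attribute_bagging_predict(TRAIN_X, TRAIN_Y, [1, 1, 2, 2])` classifies a new vector with the ensemble. In a real text categorization setting the vectors would be term weights from the vector space model, restricted to the terms kept by information gain.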