DocumentCode
3318213
Title
Learning effective features for Chinese text categorization
Author
Luo, Dingsheng ; Wang, Xinhao ; Wu, Xihong ; Chi, Huisheng
Author_Institution
Nat. Lab. on Machine Perception, Peking Univ., Beijing, China
fYear
2005
fDate
30 Oct.-1 Nov. 2005
Firstpage
608
Lastpage
613
Abstract
Text categorization suffers from high dimensionality, which forces the learning system into either lower efficiency or lower performance. A number of feature selection methods, such as DF, IG, and Chi Square, have therefore been adopted or proposed for dimensionality reduction. Unlike these traditional feature selection methods, this paper presents a feature selection method based on the idea of "discriminative learning", in which learned "effective" features, rather than traditional "important" features, are used to construct the feature space. To learn the effective features, a variant of the AdaBoost algorithm and a pairwise multiclass learning scheme are adopted. Simulation results show that the presented method works well.
Keywords
classification; feature extraction; learning (artificial intelligence); text analysis; Chinese text categorization; dimensional reduction; discriminative learning; feature selection methods; pairwise multiclass learning scheme; variant AdaBoost algorithm; Bayesian methods; Classification tree analysis; Dictionaries; Feature extraction; Frequency; Information management; Learning systems; Machine learning; Nearest neighbor searches; Text categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN
0-7803-9361-9
Type
conf
DOI
10.1109/NLPKE.2005.1598809
Filename
1598809