DocumentCode
2326961
Title
Improving the k-NN and applying it to Chinese text classification
Author
Yuan, Fang ; Yang, Liu ; Yu, Ge
Author_Institution
Coll. of Math. & Comput. Sci., Hebei Univ., China
Volume
3
fYear
2005
fDate
18-21 Aug. 2005
Firstpage
1547
Abstract
To address the problems of applying k-NN to Chinese text classification, this paper proposes several improvements to k-NN. Word segmentation based on dictionaries and statistics can increase classification accuracy and reduce the number of dimensions. Using a genetic algorithm to learn the value of k can improve the automation of classification. A gradual classification mode improves classification efficiency. Experiments show that these improvements to k-NN can improve the efficiency of Chinese text classification while maintaining high accuracy.
Keywords
classification; genetic algorithms; text analysis; Chinese text classification; classification automatization; genetic algorithm; k-nearest neighbor; word segmentation; Computer science; Educational institutions; Electronic mail; Information science; Internet; Mathematics; Statistics; Testing; Text categorization; gradual classification mode; k-Nearest Neighbor method; text preprocessing
fLanguage
English
Publisher
ieee
Conference_Titel
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location
Guangzhou, China
Print_ISBN
0-7803-9091-1
Type
conf
DOI
10.1109/ICMLC.2005.1527190
Filename
1527190