DocumentCode :
1937114
Title :
Efficient KNN Text Categorization Based on Multiedit and Condensing Techniques
Author :
Hao, Xiu-Lan ; Zhang, Cheng-Hong ; Wang, Shu-Yun ; Tao, Xiao-Peng ; Hu, Yun-Fa
Author_Institution :
Fudan Univ., Shanghai
Volume :
6
fYear :
2007
fDate :
19-22 Aug. 2007
Firstpage :
3571
Lastpage :
3576
Abstract :
As a simple and effective classification approach, KNN is widely used in text categorization. However, a KNN classifier not only has large computational and storage requirements, but its classification performance also deteriorates when the training data are unevenly distributed. In this paper, we present a combined technique, using multi-edit nearest neighbor and condensing, for reducing noise in the training data and decreasing the cost in time and space. Our experimental results illustrate that this strategy can solve the above problems effectively.
Keywords :
noise; pattern classification; text analysis; KNN text categorization; classification approach; combinational technique; condensing technique; multiedit-nearest-neighbor; noises reduction; training data; Convolution; Data visualization; Filters; Image generation; Machine learning; Noise generators; Oceans; Streaming media; Text categorization; Vectors; Condensing algorithm; K nearest neighbor; Multi-edit algorithm; Text categorization; Training corpus pruning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
Type :
conf
DOI :
10.1109/ICMLC.2007.4370766
Filename :
4370766