Title :
A New Density-Based Method for Reducing the Amount of Training Data in k-NN Text Classification
Author :
Yuan, Fang ; Yang, Liu ; Yu, Ge
Author_Institution :
Hebei Univ., Baoding
Abstract :
With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of text data. As a simple, effective, nonparametric classification method, k-NN is widely used in text classification. However, the k-NN classifier has large computational demands, and its classification precision can decrease when the density of the training data is uneven. This paper presents a new density-based method for reducing the amount of training data, which both reduces the computational demands of the k-NN classifier and improves its classification precision. Experiments show that the new method performs better than the traditional k-NN method.
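Illustrative sketch (not the authors' algorithm, which the abstract does not specify): the Python below shows one simple way a density-based reduction could be combined with k-NN text classification. It assumes bag-of-words term-frequency vectors and cosine similarity. The function names and the parameters eps and max_close are hypothetical, and the greedy thinning rule is a stand-in heuristic chosen only to make the idea concrete.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reduce_training_set(samples, eps=0.7, max_close=1):
    """Hypothetical greedy thinning of dense regions (an assumption,
    not the paper's method): skip a sample when at least `max_close`
    already-kept samples of the same class have similarity >= `eps`."""
    kept = []
    for vec, label in samples:
        close = sum(
            1 for kept_vec, kept_label in kept
            if kept_label == label and cosine(vec, kept_vec) >= eps
        )
        if close < max_close:
            kept.append((vec, label))
    return kept


def knn_classify(samples, query, k=3):
    """Majority vote among the k training samples most similar to the query."""
    nearest = sorted(samples, key=lambda s: cosine(query, s[0]), reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Because k-NN compares each query against every stored sample, dropping near-duplicate samples from dense regions shrinks that comparison set directly. Whether precision also improves depends on the data and on the actual reduction rule, which this sketch does not reproduce.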
Keywords :
classification; text analysis; density-based method; k-nearest neighbor classifier; text classification; Computer science; Cybernetics; Educational institutions; Internet; Machine learning; Mathematics; Nearest neighbor searches; Testing; Text categorization; Training data; Density; K-Nearest Neighbor
Conference_Title :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
DOI :
10.1109/ICMLC.2007.4370730