DocumentCode :
402857
Title :
A simple and efficient classifying algorithm
Author :
Wang, Jian-hui ; Zhou, Shui-geng ; Hu, Yun-fa
Author_Institution :
Dept. of Comput. & Inf. Technol., Fudan Univ., Shanghai, China
Volume :
1
fYear :
2003
fDate :
2-5 Nov. 2003
Firstpage :
51
Abstract :
Most current classification methods are based on the vector space model (VSM), of which k-nearest neighbors (kNN) is a widely used example. Most of them, however, are computationally expensive, so they cannot be used when a large number of specimens must be classified. They also scale poorly, because the classifier must be rebuilt whenever the training corpus is extended. This paper puts forward two new notions, mutual dependence and equivalent radius, and proposes SECTILE, a new classification algorithm based on them. SECTILE is then applied to classifying Chinese documents and compared with the kNN and CCC methods. The experimental results suggest that SECTILE outperforms kNN and CCC, can be used online to classify a large number of specimens, and scales well, while maintaining classification precision and recall.
Keywords :
pattern classification; text analysis; Chinese documents; classifying methods; equivalent radius; k-nearest neighbors; mutual dependence; simple and efficient algorithm to classify texts based on equivalent radius and mutual dependence; Classification algorithms; Concrete; Cybernetics; Electronic mail; Information technology; Machine learning; Mutual information; Natural language processing; Scalability; Space technology;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
Type :
conf
DOI :
10.1109/ICMLC.2003.1264441
Filename :
1264441