DocumentCode :
3100279
Title :
A novel text classification based on Mahalanobis distance
Author :
Zhang, Suli ; Pan, Xin
Author_Institution :
Sch. of Electr. & Inf. Technol., Changchun Inst. of Technol., Changchun, China
Volume :
3
fYear :
2011
fDate :
11-13 March 2011
Firstpage :
156
Lastpage :
158
Abstract :
In text mining, KNN (K Nearest Neighbors) is one of the oldest and simplest methods of text classification. However, it is known to be sensitive to the distance (or similarity) function used to classify a test instance; this weakness can lower classification accuracy and limit the usefulness of the KNN classifier in text classification. In this paper, we introduce the Mahalanobis distance into text classification and propose an algorithm (MDKNN) based on it. Experiments show that our method performs comparably to or better than the KNN classifier and the Naïve Bayes classifier in text classification.
Keywords :
Bayes methods; data mining; pattern classification; text analysis; K nearest neighbors; KNN classifier; Mahalanobis distance; Naive Bayes classifier; text classification; text mining; Accuracy; Classification algorithms; Covariance matrix; Support vector machine classification; Text categorization; Training; Chinese
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Research and Development (ICCRD), 2011 3rd International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-839-6
Type :
conf
DOI :
10.1109/ICCRD.2011.5764268
Filename :
5764268