Regional Information Center for Science and Technology - An Improved Density-Based Method for Reducing Training Data in KNN

DocumentCode :

1845637

Title :

An Improved Density-Based Method for Reducing Training Data in KNN

Author :

Yongxia Jing ; Heping Gou ; Yaling Zhu

Author_Institution :

Dept. of Inf. Technol., Qiongtai Teachers Coll., Haikou, China

fYear :

2013

fDate :

21-23 June 2013

Firstpage :

972

Lastpage :

975

Abstract :

The k-Nearest Neighbor (KNN) algorithm is an effective text categorization algorithm in terms of recall and accuracy, but its computational overhead is directly proportional to the sample size, so classification is slow on large-scale sample data. To address this problem, the paper presents a density-based method for reducing training data. The method clusters each class of sample data into several clusters, removes noisy samples, and then merges highly similar sample documents within each cluster into a single document. Experimental results indicate that the method reduces the computational overhead of KNN text classification while achieving performance approximately equal to that of traditional KNN.
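
The three steps named in the abstract (per-class clustering, noise removal, merging of highly similar documents) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' procedure: this record does not give the paper's density definition, thresholds, or merge rule, so cosine similarity, the connected-component clustering, the centroid merge, and the names `reduce_class`, `reduce_training_set`, `knn_predict`, `eps_sim`, `min_pts`, and `merge_sim` are all assumptions.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def reduce_class(vectors, eps_sim=0.9, min_pts=2, merge_sim=0.95):
    """Reduce the samples of ONE class (all thresholds are assumed values)."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

    # Noise removal (assumed rule): keep only samples that have at least
    # min_pts other samples with similarity >= eps_sim.
    dense = [i for i in range(n)
             if sum(1 for j in range(n) if j != i and sim[i][j] >= eps_sim) >= min_pts]

    # Clustering (assumed rule): connected components of the dense samples,
    # where two samples are linked if their similarity is >= eps_sim.
    clusters, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in dense:
                if j not in seen and sim[i][j] >= eps_sim:
                    seen.add(j)
                    stack.append(j)
        clusters.append(members)

    # Merging (assumed rule): inside each cluster, greedily group samples with
    # similarity >= merge_sim and replace each group by its centroid.
    reduced = []
    for members in clusters:
        merged = set()
        for i in members:
            if i in merged:
                continue
            group = [i] + [j for j in members
                           if j != i and j not in merged and sim[i][j] >= merge_sim]
            merged.update(group)
            dim = len(vectors[i])
            reduced.append([sum(vectors[g][d] for g in group) / len(group)
                            for d in range(dim)])
    return reduced


def reduce_training_set(samples, **thresholds):
    """samples: list of (vector, label). Returns a reduced list of (vector, label)."""
    by_label = defaultdict(list)
    for vector, label in samples:
        by_label[label].append(vector)
    return [(v, label)
            for label, vectors in by_label.items()
            for v in reduce_class(vectors, **thresholds)]


def knn_predict(train, query, k=3):
    """Plain KNN vote over the k most cosine-similar training samples."""
    ranked = sorted(train, key=lambda s: cosine(s[0], query), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]
```

Because KNN compares each query against every stored sample, classification cost falls in proportion to the number of samples that survive reduction; whether accuracy is preserved depends on the thresholds chosen, which would need to be tuned on real data.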

Keywords :

pattern classification; pattern clustering; text analysis; KNN algorithm; KNN text classification; classification speed; density-based method; documents; k-nearest neighbor algorithm; large-scale sample data; noise sample data reduction; sample data clustering; sample size; text categorization algorithm; training data reduction; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Noise; Support vector machine classification; Text categorization; Training; samples reducing; similarity; text clustering

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computational and Information Sciences (ICCIS), 2013 Fifth International Conference on

Conference_Location :

Shiyan, China

Type :

conf

DOI :

10.1109/ICCIS.2013.261

Filename :

6643177

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1845637