DocumentCode
2991578
Title
TKNN: An Improved KNN Algorithm Based on Tree Structure
Author
Juan, Li
Author_Institution
Sch. of Distance Educ., Shaanxi Normal Univ., Xi'an, China
fYear
2011
fDate
3-4 Dec. 2011
Firstpage
1390
Lastpage
1394
Abstract
Text classification is the process of assigning documents to a set of predefined categories. It is widely used in applications such as web page categorization, email spam filtering, and document indexing. Many popular algorithms for text classification have been proposed, such as Naive Bayes, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). However, these approaches do not perform well in multi-class text classification because they rely heavily on linear classifiers. KNN is a simple and mature algorithm, but it cannot effectively handle overlapping category borders, unbalanced class samples, determination of the k value, and an overly large search space. In this paper, we propose TKNN, a new algorithm that incorporates a tree structure and an adaptive k-value method into the classical KNN algorithm. TKNN can overcome these shortcomings of KNN and improve the performance of multi-class text classification. Theoretical analysis and experimental results show that TKNN greatly improves classification efficiency compared with KNN.
Keywords
pattern classification; support vector machines; text analysis; tree data structures; KNN algorithm; TKNN; Web page categorization; document assignment; document indexing; email spam filtering; fixed categories; k-nearest neighbor; linear classifiers; naive Bayes; support vector machine; text classification; tree structure; Accuracy; Algorithm design and analysis; Buildings; Classification algorithms; Complexity theory; Text categorization; Training; KNN; penalty parameter; unbalanced class samples
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Security (CIS), 2011 Seventh International Conference on
Conference_Location
Hainan
Print_ISBN
978-1-4577-2008-6
Type
conf
DOI
10.1109/CIS.2011.310
Filename
6128351