DocumentCode
2276484
Title
Text Categorization Research Based on Cluster Idea
Author
Lin, Jialun ; Li, Xiaoling ; Jiao, Yuan
Author_Institution
Comput. Teaching & Res. Sect., Hainan Med. Coll., Haikou, China
Volume
1
fYear
2010
fDate
6-7 March 2010
Firstpage
483
Lastpage
486
Abstract
Classification and clustering are frequently used methods in data mining. This paper introduces the idea of text clustering into the study of categorization algorithms. The authors use the self-initiated learning pattern of text categorization to design a clustering-based text categorization algorithm, with the aim of reducing the dimension of the training set and improving the efficiency of categorization. A series of experiments shows that the algorithm greatly improves efficiency at the cost of a slight reduction in categorization accuracy, balancing the trade-off between the two.
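The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of the general idea it describes: cluster the training documents first, then categorize against the much smaller set of cluster centroids. Pairing K-Means with KNN is an assumption taken from the keywords of this record; the function names (`kmeans`, `condense`, `knn_predict`), the use of the first k points as initial centroids, and the toy two-dimensional term-weight vectors are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iterations=20):
    """Plain K-Means. Assumption: the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(p, centroids[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def condense(training, k_per_class):
    """Replace each class's documents with at most k_per_class centroids."""
    by_label = {}
    for vector, label in training:
        by_label.setdefault(label, []).append(vector)
    reduced = []
    for label, vectors in by_label.items():
        for centroid in kmeans(vectors, min(k_per_class, len(vectors))):
            reduced.append((centroid, label))
    return reduced


def knn_predict(reduced, query, k=1):
    """Majority label among the k centroids nearest to the query vector."""
    neighbours = sorted(reduced, key=lambda item: distance(item[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


# Invented toy data: (term-weight vector, category label).
training = [
    ((1.0, 0.0), "sports"), ((0.9, 0.1), "sports"),
    ((0.8, 0.2), "sports"), ((1.0, 0.1), "sports"),
    ((0.0, 1.0), "finance"), ((0.1, 0.9), "finance"),
    ((0.2, 0.8), "finance"), ((0.1, 1.0), "finance"),
]
reduced = condense(training, k_per_class=2)  # 8 documents become 4 centroids
prediction = knn_predict(reduced, (0.95, 0.05))
```

In this sketch each query is compared against 4 centroids instead of 8 documents, which illustrates where the efficiency gain would come from; the centroids summarize the documents rather than reproduce them, which is one plausible source of the slight accuracy loss the abstract reports.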
Keywords
pattern classification; pattern clustering; statistical analysis; text analysis; unsupervised learning; word processing; cluster idea; clustering-based text categorization algorithm; data excavation; self-initiated learning; text categorization pattern; text clustering; training set; Algorithm design and analysis; Clustering algorithms; Clustering methods; Computer science; Computer science education; Educational technology; Management training; Partitioning algorithms; Testing; Text categorization; K-Means algorithm; KNN algorithm; text categorization; text clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Education Technology and Computer Science (ETCS), 2010 Second International Workshop on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-6388-6
Electronic_ISBN
978-1-4244-6389-3
Type
conf
DOI
10.1109/ETCS.2010.413
Filename
5458527