DocumentCode
3202230
Title
The Capability Analysis on the Characteristic Selection Algorithm of Text Categorization Based on F1 Measure Value
Author
He Shaojun ; Cao Jin ; Guo Ruixu ; Wang Guijun
Author_Institution
Northern Electron. Instrum. Inst., Beijing, China
fYear
2012
fDate
8-10 Dec. 2012
Firstpage
742
Lastpage
746
Abstract
Text categorization is an important aspect of natural language processing. It can be used to identify category information in natural-language text, which helps address information clutter and supports directed detection and scouting of information. This paper presents the general process of text categorization. Using the Sogou datasets as the target, the performance of several typical characteristic (feature) selection algorithms is analyzed with a KNN classifier under different characteristic dimensions and classification methods, with the text categorization experiments evaluated by F1 measure value.
Keywords
natural language processing; pattern classification; text analysis; F1 measure value; KNN classification machine; Sogou datasets; capability analysis; categorization information; characteristic dimensions; characteristic selection algorithm; classification methods; directional detection; nature language processing; text categorization; text categorization experiment; Algorithm design and analysis; Classification algorithms; Computational modeling; Data models; Internet; Support vector machines; Characteristic Dimension; Characteristic Selection
fLanguage
English
Publisher
IEEE
Conference_Titel
Instrumentation, Measurement, Computer, Communication and Control (IMCCC), 2012 Second International Conference on
Conference_Location
Harbin
Print_ISBN
978-1-4673-5034-1
Type
conf
DOI
10.1109/IMCCC.2012.180
Filename
6429015