DocumentCode :
644002
Title :
Chinese document classification using field association knowledge base
Author :
Li Wang ; Kui Jiang ; Xingyun Geng ; Yuanpeng Zhang ; Dong Zhou ; Jiancheng Dong
Author_Institution :
Dept. of Med. Informatics, Nantong Univ., Nantong, China
Volume :
03
fYear :
2012
fDate :
Oct. 30 2012-Nov. 1 2012
Firstpage :
1403
Lastpage :
1408
Abstract :
Field Association (FA) terms are a limited set of discriminating terms that offer human knowledge for identifying document (text) fields. A field association knowledge base (FAKB) is composed of FA terms and the potential hierarchical relationships among the fields to which they belong. The primary goal of this research is to build a system that imitates the process whereby humans recognize fields by looking at a few Chinese FA terms in a document (text). The document classification experiment is conducted on two data collections under different circumstances, containing 4000 and 1300 documents respectively. FAKB outperforms all the other statistical methods (SVMs, kNN, and NB), with average accuracies of 97.7% and 89%. The experimental results show that the presented method is effective in Chinese document classification.
Keywords :
document handling; knowledge based systems; natural language processing; pattern classification; statistical analysis; Chinese document classification; FAKB; discriminating terms; document fields; field association knowledge base; statistical methods; Accuracy; Data collection; Knowledge based systems; Niobium; Statistical analysis; Testing; Training data; document classification; field association knowledge base; field association terms
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-1855-6
Type :
conf
DOI :
10.1109/CCIS.2012.6664616
Filename :
6664616