DocumentCode
644002
Title
Chinese document classification using field association knowledge base
Author
Li Wang ; Kui Jiang ; Xingyun Geng ; Yuanpeng Zhang ; Dong Zhou ; Jiancheng Dong
Author_Institution
Dept. of Med. Informatics, Nantong Univ., Nantong, China
Volume
03
fYear
2012
fDate
Oct. 30 2012-Nov. 1 2012
Firstpage
1403
Lastpage
1408
Abstract
Field Association (FA) terms are a limited set of discriminating terms that give humans the knowledge needed to identify document (text) fields. A field association knowledge base (FAKB) is composed of FA terms and the potential hierarchical relationships among the fields to which they belong. The primary goal of this research is to build a system that imitates the process by which humans recognize fields by looking at a few Chinese FA terms in a document (text). Document classification experiments were carried out on two data collections under different circumstances, containing 4000 and 1300 documents respectively. FAKB outperforms the statistical methods it was compared with (SVMs, kNN, and NB), with average accuracies of 97.7% and 89%. The experimental results show that the presented method is effective for Chinese document classification.
Keywords
document handling; knowledge based systems; natural language processing; pattern classification; statistical analysis; Chinese document classification; FAKB; discriminating terms; document fields; field association knowledge base; statistical methods; Accuracy; Data collection; Knowledge based systems; Niobium; Statistical analysis; Testing; Training data; document classification; field association knowledge base; field association terms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
Conference_Location
Hangzhou
Print_ISBN
978-1-4673-1855-6
Type
conf
DOI
10.1109/CCIS.2012.6664616
Filename
6664616
Link To Document