• DocumentCode
    644002
  • Title

    Chinese document classification using field association knowledge base

  • Author

    Li Wang ; Kui Jiang ; Xingyun Geng ; Yuanpeng Zhang ; Dong Zhou ; Jiancheng Dong

  • Author_Institution
    Dept. of Med. Informatics, Nantong Univ., Nantong, China
  • Volume
    03
  • fYear
    2012
  • fDate
    Oct. 30 2012-Nov. 1 2012
  • Firstpage
    1403
  • Lastpage
    1408
  • Abstract
    Field Association (FA) terms are a limited set of discriminating terms that provide the knowledge humans use to identify document (text) fields. A field association knowledge base (FAKB) is composed of FA terms and the potential hierarchical relationships among the fields they belong to. The primary goal of this research is to build a system that imitates the process whereby humans recognize fields by looking at a few Chinese FA terms in a document (text). The document classification experiments are conducted on two data collections under different circumstances, containing 4000 and 1300 documents respectively. FAKB outperforms the statistical methods compared (SVMs, kNN, and NB), with average accuracies of 97.7% and 89%. The experimental results show that the novel method presented is effective for Chinese document classification.
  • Keywords
    document handling; knowledge based systems; natural language processing; pattern classification; statistical analysis; Chinese document classification; FAKB; discriminating terms; document fields; field association knowledge base; statistical methods; Accuracy; Data collection; Knowledge based systems; Niobium; Statistical analysis; Testing; Training data; document classification; field association knowledge base; field association terms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4673-1855-6
  • Type

    conf

  • DOI
    10.1109/CCIS.2012.6664616
  • Filename
    6664616