• DocumentCode
    1945372
  • Title

    Knowledge discovery method to accomplish English document classification

  • Author

    Ghada, Elmarhomy ; Atlam, Elsayed ; Hanafusa, Hiro ; Fuketa, Masao ; Morita, Kazuhiro ; Aoe, Jun-Ichi

  • Author_Institution
    Dept. of Inf. Sci. & Intelligent Syst., Tokushima Univ., Japan
  • fYear
    2005
  • fDate
    19-21 May 2005
  • Firstpage
    268
  • Abstract
    Although much research on text classification relies on vector-space models that use word information from the whole text, humans can generally recognize the field of a text by finding specific words. This paper describes what a field-associated term is and how to discover such terms, which exist in any text. Such words, called field association (FA) words, can be directly related to field classification. Five criteria for FA terms are defined for hierarchical fields. These terms are stored in a field tree, which is used to extract field-coherent passages for document classification. The presented approach is evaluated through simulation on 140 text files in the sports field, and extended to 197 text files in the civil engineering field.
  • Keywords
    data mining; natural languages; text analysis; word processing; English document classification; field association word; field-associated term discovery; knowledge discovery; text classification; Civil engineering; Classification tree analysis; Data mining; Humans; Information science; Intelligent systems; Stability; Text categorization; Text recognition; Tree data structures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Active Media Technology, 2005. (AMT 2005). Proceedings of the 2005 International Conference on
  • Print_ISBN
    0-7803-9035-0
  • Type

    conf

  • DOI
    10.1109/AMT.2005.1505330
  • Filename
    1505330