DocumentCode
1945372
Title
Knowledge discovery method to accomplish English document classification
Author
Ghada, Elmarhomy ; Atlam, Elsayed ; Hanafusa, Hiro ; Fuketa, Masao ; Morita, Kazuhiro ; Aoe, Jun-Ichi
Author_Institution
Dept. of Inf. Sci. & Intelligent Syst., Tokushima Univ., Japan
fYear
2005
fDate
19-21 May 2005
Firstpage
268
Abstract
Although much research on text classification relies on vector spaces built from word information across the whole text, humans can generally recognize a document's field by finding a few specific words. This paper describes what a field-associated term is and how to discover field-associated terms, which exist in any text. Such words, called field association (FA) words, can be directly related to the field classification. Five criteria for FA terms are defined for hierarchical fields. All FA terms are stored in a field tree, which is used to extract field-coherent passages for document classification. The presented approach is evaluated by simulation on text files of 140 fields in the sports field and extended by 197 text fields of civil engineering.
Keywords
data mining; natural languages; text analysis; word processing; English document classification; field association word; field-associated term discovery; knowledge discovery; text classification; Civil engineering; Classification tree analysis; Data mining; Humans; Information science; Intelligent systems; Stability; Text categorization; Text recognition; Tree data structures;
fLanguage
English
Publisher
IEEE
Conference_Titel
Active Media Technology, 2005. (AMT 2005). Proceedings of the 2005 International Conference on
Print_ISBN
0-7803-9035-0
Type
conf
DOI
10.1109/AMT.2005.1505330
Filename
1505330