DocumentCode
2643828
Title
Context representation using word sequences extracted from a news corpus
Author
Sekiya, Hiroshi ; Kondo, Takeshi ; Hashimoto, Makoto ; Takagi, Tomohiro
Author_Institution
Dept. of Comput. Sci., Meiji Univ., Kanagawa, Japan
fYear
2005
fDate
26-28 June 2005
Firstpage
783
Lastpage
786
Abstract
Word meaning changes dynamically depending on context, so the context must be specified to identify the meaning. However, context varies with the specificity of the topic and the viewpoint of the writer. In this paper, we propose that a word sequence can be used to identify context. We show concrete examples of both the contexts identified by word sequences and the word sets related to those contexts. We used 800,000 Reuters news articles and extracted the word sets using the confabulation model, with five statistical measures as relations. We compared the measures and found that cogency and mutual information were the most effective. We demonstrate the usefulness of word sequences for identifying context.
Keywords
statistical analysis; text analysis; confabulation model; context representation; news article; statistical measure; word meaning; word sequence; Data mining; Humans; Information processing; Mutual information; Natural language processing; Natural languages
fLanguage
English
Publisher
IEEE
Conference_Titel
NAFIPS 2005: 2005 Annual Meeting of the North American Fuzzy Information Processing Society
Print_ISBN
0-7803-9187-X
Type
conf
DOI
10.1109/NAFIPS.2005.1548639
Filename
1548639