DocumentCode :
2643828
Title :
Context representation using word sequences extracted from a news corpus
Author :
Sekiya, Hiroshi ; Kondo, Takeshi ; Hashimoto, Makoto ; Takagi, Tomohiro
Author_Institution :
Dept. of Comput. Sci., Meiji Univ., Kanagawa, Japan
fYear :
2005
fDate :
26-28 June 2005
Firstpage :
783
Lastpage :
786
Abstract :
The meaning of a word changes dynamically with its context, so the context must be specified to identify the meaning. However, context varies with the specificity of the topic and the viewpoint of the writer. In this paper, we propose that a word sequence can be used to identify context. We show concrete examples of both the contexts identified by word sequences and the word sets related to those contexts. We used 800,000 Reuters news articles and extracted the word sets using the confabulation model with five statistical measures as relations. Comparing the measures, we found that cogency and mutual information were the most effective. We demonstrate the usefulness of word sequences for identifying context.
Keywords :
statistical analysis; text analysis; confabulation model; context representation; news article; statistical measure; word meaning; word sequence; Data mining; Humans; Information processing; Mutual information; Natural language processing; Natural languages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Annual Meeting of the North American Fuzzy Information Processing Society, 2005 (NAFIPS 2005)
Print_ISBN :
0-7803-9187-X
Type :
conf
DOI :
10.1109/NAFIPS.2005.1548639
Filename :
1548639