  • DocumentCode
    2643828
  • Title
    Context representation using word sequences extracted from a news corpus
  • Author
    Sekiya, Hiroshi ; Kondo, Takeshi ; Hashimoto, Makoto ; Takagi, Tomohiro
  • Author_Institution
    Dept. of Comput. Sci., Meiji Univ., Kanagawa, Japan
  • fYear
    2005
  • fDate
    26-28 June 2005
  • Firstpage
    783
  • Lastpage
    786
  • Abstract
    Word meaning changes dynamically depending on context, so the context must be specified to identify the meaning. However, context varies with the specificity of the topic and the viewpoint of the writer. In this paper, we propose that a word sequence can be used to identify context. We show concrete examples of contexts identified by word sequences and of the word sets related to those contexts. We used 800,000 Reuters news articles and extracted the word sets using the confabulation model, with five statistical measures serving as relations. We compared the measures and found that cogency and mutual information were the most effective. We demonstrate the usefulness of word sequences for identifying context.
  • Keywords
    statistical analysis; text analysis; confabulation model; context representation; news article; statistical measure; word meaning; word sequence; Data mining; Humans; Information processing; Mutual information; Natural language processing; Natural languages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Information Processing Society, 2005. NAFIPS 2005. Annual Meeting of the North American
  • Print_ISBN
    0-7803-9187-X
  • Type
    conf
  • DOI
    10.1109/NAFIPS.2005.1548639
  • Filename
    1548639