• DocumentCode
    2136135
  • Title

    Word Sense Disambiguation Based on Bayes Model and Information Gain

  • Author

    Yu Zhengtao ; Bin, Deng ; Bo, Hou ; Lu, Han ; Guo Jianyi

  • Author_Institution
    Sch. of Inf. Eng. & Autom., Kunming Univ. of Sci. & Technol., Kunming, China
  • Volume
    2
  • fYear
    2008
  • fDate
    13-15 Dec. 2008
  • Firstpage
    153
  • Lastpage
    157
  • Abstract
    Word sense disambiguation has always been a key problem in natural language processing. In this paper, we use information gain to calculate the weight of the context at each position, reflecting how strongly that position affects the ambiguous word. On this basis, we select the context at the six positions before and after the ambiguous word to construct the feature vectors. The feature vectors are assigned different weights in the Bayesian model, thereby improving it. We use HowNet senses to describe the meanings of ambiguous words. In experiments on 10 Chinese ambiguous words, the average accuracy was 95.72% in the closed test and 85.71% in the open test. The results show that the method proposed in this paper is very effective.
  • Keywords
    Bayes methods; computational linguistics; natural language processing; Bayesian model; Chinese word sense disambiguation; HowNet; feature vector construction; information gain; semantic relation; syntactic relation; Application software; Automation; Bayesian methods; Computer applications; Databases; Educational technology; Hidden Markov models; Information processing; Testing; Natural Language Processing (NLP); Word Sense Disambiguation (WSD); weight of context position
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Future Generation Communication and Networking, 2008. FGCN '08. Second International Conference on
  • Conference_Location
    Hainan Island
  • Print_ISBN
    978-0-7695-3431-2
  • Type

    conf

  • DOI
    10.1109/FGCN.2008.188
  • Filename
    4734195
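
The abstract describes weighting context positions by information gain inside a Bayesian classifier but gives no formulas. The sketch below is one plausible reading, not the authors' implementation. It assumes a window of six positions on each side, information gain computed as the sense entropy minus the entropy conditioned on the word at a given offset, and weights applied as exponents on the per-position likelihoods with add-one smoothing. All names (`context`, `position_weights`, `train`, `classify`, `PAD`) are hypothetical, and plain string labels stand in for HowNet senses.

```python
import math
from collections import Counter, defaultdict

WINDOW = 6  # assumed: six positions on each side of the ambiguous word
OFFSETS = [o for o in range(-WINDOW, WINDOW + 1) if o != 0]
PAD = "<pad>"  # hypothetical filler for offsets that fall outside the sentence


def context(tokens, index):
    """Map each offset to the token found there, or PAD if out of range."""
    return {
        o: tokens[index + o] if 0 <= index + o < len(tokens) else PAD
        for o in OFFSETS
    }


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of label frequencies."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)


def position_weights(examples):
    """Information gain of the sense label given the word at each offset."""
    base = entropy(Counter(sense for _, sense in examples))
    weights = {}
    for o in OFFSETS:
        by_word = defaultdict(Counter)
        for ctx, sense in examples:
            by_word[ctx[o]][sense] += 1
        conditional = sum(
            sum(c.values()) / len(examples) * entropy(c) for c in by_word.values()
        )
        weights[o] = base - conditional
    return weights


def train(examples):
    """examples: list of (context dict, sense label) pairs."""
    counts = defaultdict(Counter)  # (sense, offset) -> word frequencies
    vocab = set()
    for ctx, sense in examples:
        for o, word in ctx.items():
            counts[(sense, o)][word] += 1
            vocab.add(word)
    return {
        "senses": Counter(sense for _, sense in examples),
        "counts": counts,
        "vocab_size": len(vocab) + 1,  # +1 reserves mass for unseen words
        "weights": position_weights(examples),
        "total": len(examples),
    }


def classify(model, ctx):
    """Return the sense maximizing log P(s) + sum_o w_o * log P(word_o | s, o)."""

    def score(sense):
        total = math.log(model["senses"][sense] / model["total"])
        for o, word in ctx.items():
            c = model["counts"][(sense, o)]
            p = (c[word] + 1) / (sum(c.values()) + model["vocab_size"])
            total += model["weights"][o] * math.log(p)
        return total

    return max(model["senses"], key=score)
```

Under these assumptions, an offset whose word never helps separate the senses receives a weight near zero and drops out of the score, while an offset that fully determines the sense receives the full sense entropy as its weight.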