• DocumentCode
    3545588
  • Title

    An Approach to Word Sense Disambiguation in English-Vietnamese-English Statistical Machine Translation

  • Author

    Nguyen, Quy ; Nguyen, An ; Dinh, Dien

  • Author_Institution
    Fac. of Comput. Sci., Univ. of Inf. Technol., Ho Chi Minh City, Vietnam
  • fYear
    2012
  • fDate
    Feb. 27 2012-March 1 2012
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    The most difficult problem in machine translation (MT) in general, and in statistical machine translation (SMT) in particular, is selecting the correct meaning of polysemous words. The correct meaning depends mainly on the context and the topic of the text. Therefore, to improve the quality of SMT by resolving the semantic ambiguity of words, we integrate additional knowledge about the topic of the text, part-of-speech (POS), and morphology. We applied this model to an English-Vietnamese-English SMT system, and BLEU scores increased by over 6% compared with the baseline general SMT system, which did not integrate information about the topic or other linguistic knowledge.
  • Keywords
    language translation; natural language processing; text analysis; BLEU scores; English-Vietnamese-English statistical machine translation; morphology; part-of-speech; polysemous words; text context; text topic; word sense disambiguation; Buildings; Semantics; Support vector machines; Tagging; Testing; Training; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on
  • Conference_Location
    Ho Chi Minh City
  • Print_ISBN
    978-1-4673-0307-1
  • Type

    conf

  • DOI
    10.1109/rivf.2012.6169839
  • Filename
    6169839