  • DocumentCode
    3545612
  • Title
    Improving English-Vietnamese Word Alignment Using Translation Model
  • Author
    Nguyen, Giang Thanh ; Dinh, Dien
  • Author_Institution
    Linguistic R&D Dept., Kim Tu Dien Multilingual Data Center, Ho Chi Minh City, Vietnam
  • fYear
    2012
  • fDate
    Feb. 27 2012-March 1 2012
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Word alignment for a parallel corpus is the connection between the words/phrases in the source language and the words/phrases in the target language. The alignment result is an important input for many natural language processing applications. In this paper, we propose an approach to improving English-Vietnamese word alignment by using the alignment frequency recorded in the translation model of a statistical machine translation (SMT) system. We also identify five common error types in English-Vietnamese word alignment and propose heuristic patterns for detecting these alignment errors. Experimental results show an improvement over the results of GIZA++.
  • Keywords
    language translation; natural language processing; statistical analysis; English-Vietnamese word alignment; alignment errors; alignment frequency; heuristic patterns; statistical machine translation; translation model; Computational linguistics; Computational modeling; Feature extraction; Heuristic algorithms; Hidden Markov models; Measurement; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on
  • Conference_Location
    Ho Chi Minh City
  • Print_ISBN
    978-1-4673-0307-1
  • Type
    conf
  • DOI
    10.1109/rivf.2012.6169841
  • Filename
    6169841