DocumentCode
3545612
Title
Improving English-Vietnamese Word Alignment Using Translation Model
Author
Nguyen, Giang Thanh ; Dinh, Dien
Author_Institution
Linguistic R&D Dept., Kim Tu Dien Multilingual Data Center, Ho Chi Minh City, Vietnam
fYear
2012
fDate
Feb. 27 2012-March 1 2012
Firstpage
1
Lastpage
4
Abstract
Word alignment for a parallel corpus is the connection between words/phrases in the source language and words/phrases in the target language. The alignment result is an important input for many natural language processing applications. In this paper, we propose an approach to improve English-Vietnamese word alignment by using the alignment frequency recorded in the translation model of a statistical machine translation (SMT) system. We also identify five common error types in English-Vietnamese word alignment and propose heuristic patterns to detect these alignment errors. The experimental results show an improvement over the results of GIZA++.
Keywords
language translation; natural language processing; statistical analysis; English-Vietnamese word alignment; alignment errors; alignment frequency; heuristic patterns; statistical machine translation; translation model; Computational linguistics; Computational modeling; Feature extraction; Heuristic algorithms; Hidden Markov models; Measurement; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on
Conference_Location
Ho Chi Minh City
Print_ISBN
978-1-4673-0307-1
Type
conf
DOI
10.1109/rivf.2012.6169841
Filename
6169841