DocumentCode :
3545612
Title :
Improving English-Vietnamese Word Alignment Using Translation Model
Author :
Nguyen, Giang Thanh ; Dinh, Dien
Author_Institution :
Linguistic R&D Dept., Kim Tu Dien Multilingual Data Center, Ho Chi Minh City, Vietnam
fYear :
2012
fDate :
Feb. 27 2012-March 1 2012
Firstpage :
1
Lastpage :
4
Abstract :
Word alignment for a parallel corpus is the mapping between words or phrases in the source language and their counterparts in the target language. The alignment result is an important input for many natural language processing applications. In this paper, we propose an approach that improves English-Vietnamese word alignment by using the alignment frequency recorded in the translation model of a statistical machine translation (SMT) system. We also identify five common error types in English-Vietnamese word alignment and propose heuristic patterns for detecting these alignment errors. Experimental results show an improvement over the alignments produced by GIZA++.
Keywords :
language translation; natural language processing; statistical analysis; English-Vietnamese word alignment; alignment errors; alignment frequency; heuristic patterns; natural language processing; statistical machine translation; translation model; Computational linguistics; Computational modeling; Feature extraction; Heuristic algorithms; Hidden Markov models; Measurement; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on
Conference_Location :
Ho Chi Minh City
Print_ISBN :
978-1-4673-0307-1
Type :
conf
DOI :
10.1109/rivf.2012.6169841
Filename :
6169841