DocumentCode :
2910575
Title :
Character-Level System Combination: An Empirical Study for English-to-Chinese Spoken Language Translation
Author :
Du, Jinhua
Author_Institution :
Fac. of Autom. & Inf. Eng., Xi'an Univ. of Technol., Xi'an, China
fYear :
2011
fDate :
15-17 Nov. 2011
Firstpage :
181
Lastpage :
184
Abstract :
This paper proposes a character-level system combination strategy for English-to-Chinese spoken language translation. For languages such as Chinese, in which word boundaries are not orthographically marked, word segmentation, which splits a sentence into a sequence of words, is often required for many Natural Language Processing tasks. In this paper we evaluate the impact of segmentation of spoken data on the performance of system combination, and show that using inappropriate segmentation in system combination can result in inferior performance compared to single systems. We further demonstrate that using characters as the basic translation unit in system combination on the IWSLT ASR translation task leads to significant gains in translation quality in terms of BLEU and NIST scores.
Keywords :
language translation; natural language processing; BLEU score; English-to-Chinese spoken language translation; IWSLT ASR translation task; NIST score; character-level system combination; natural language processing task; spoken data segmentation; translation quality; word boundary; word segmentation; Accuracy; Decoding; Heuristic algorithms; Hidden Markov models; Measurement; NIST; Training data; Character-level; Chinese Spoken Language Translation; System Combination;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location :
Penang
Print_ISBN :
978-1-4577-1733-8
Type :
conf
DOI :
10.1109/IALP.2011.47
Filename :
6121498