DocumentCode
2910575
Title
Character-Level System Combination: An Empirical Study for English-to-Chinese Spoken Language Translation
Author
Du, Jinhua
Author_Institution
Fac. of Autom. & Inf. Eng., Xi'an Univ. of Technol., Xi'an, China
fYear
2011
fDate
15-17 Nov. 2011
Firstpage
181
Lastpage
184
Abstract
This paper proposes a character-level system combination strategy for English-to-Chinese spoken language translation. For languages such as Chinese, in which word boundaries are not orthographically marked, word segmentation, which splits a sentence into a sequence of words, is often required for Natural Language Processing tasks. In this paper we evaluate the impact of segmentation of spoken data on the performance of system combination, and show that using inappropriate segmentation in system combination can result in performance inferior to that of the single systems. We further demonstrate that using characters as the basic translation unit in system combination on the IWSLT ASR translation task leads to significant gains in translation quality in terms of BLEU and NIST scores.
Keywords
language translation; natural language processing; BLEU score; English-to-Chinese spoken language translation; IWSLT ASR translation task; NIST score; character-level system combination; natural language processing task; spoken data segmentation; translation quality; word boundary; word segmentation; Accuracy; Decoding; Heuristic algorithms; Hidden Markov models; Measurement; NIST; Training data; Character-level; Chinese Spoken Language Translation; System Combination;
fLanguage
English
Publisher
IEEE
Conference_Titel
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location
Penang
Print_ISBN
978-1-4577-1733-8
Type
conf
DOI
10.1109/IALP.2011.47
Filename
6121498