Title :
Translating TED speeches by recurrent neural network based translation model
Author :
Youzheng Wu ; Xinhu Hu ; Chiori Hori
Author_Institution :
Spoken Language Commun. Lab., Kyoto, Japan
Abstract :
This paper presents our recent progress on translating TED speeches, a collection of public lectures covering a variety of topics. Specifically, we use word-to-word alignment to compose translation units of bilingual tuples and present a recurrent neural network-based translation model (RNNTM) that captures long-span context when estimating the translation probabilities of bilingual tuples. However, this RNNTM suffers from a severe data sparsity problem due to the large tuple vocabulary and limited training data. Therefore, a factored RNNTM, which takes the source and target phrases of the tuples as input features in addition to the bilingual tuples themselves, is proposed to partially address the problem. Our experimental results on the IWSLT2012 test sets show that the proposed models significantly improve translation quality over state-of-the-art phrase-based translation systems.
Keywords :
language translation; natural language processing; recurrent neural nets; speech processing; TED speech translation; bilingual tuple; long-span context; public lecture collection; recurrent neural network; translation model; word-to-word alignment; Context; Context modeling; Mathematical model; Recurrent neural networks; Training data; Vocabulary; IWSLT; spoken language translation
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854977