DocumentCode
180178
Title
Translating TED speeches by recurrent neural network based translation model
Author
Youzheng Wu ; Xinhu Hu ; Hori, Chiori
Author_Institution
Spoken Language Commun. Lab., Kyoto, Japan
fYear
2014
fDate
4-9 May 2014
Firstpage
7098
Lastpage
7102
Abstract
This paper presents our recent progress on translating TED speeches, a collection of public lectures covering a variety of topics. Specifically, we use word-to-word alignment to compose translation units of bilingual tuples and present a recurrent neural network-based translation model (RNNTM) that captures long-span context when estimating the translation probabilities of bilingual tuples. However, this RNNTM suffers from a severe data sparsity problem because of the large tuple vocabulary and the limited training data. Therefore, a factored RNNTM, which takes the source and target phrases of the tuples as input features in addition to the bilingual tuples themselves, is proposed to partially address the problem. Our experimental results on the IWSLT2012 test sets show that the proposed models significantly improve translation quality over state-of-the-art phrase-based translation systems.
Keywords
language translation; natural language processing; recurrent neural nets; speech processing; TED speech translation; bilingual tuple; long-span context; public lecture collection; recurrent neural network; translation model; word-to-word alignment; Context; Context modeling; Mathematical model; Recurrent neural networks; Training data; Vocabulary; IWSLT; spoken language translation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6854977
Filename
6854977