  • DocumentCode
    180178
  • Title

    Translating TED speeches by recurrent neural network based translation model

  • Author

    Youzheng Wu ; Xinhu Hu ; Chiori Hori

  • Author_Institution
    Spoken Language Commun. Lab., Kyoto, Japan
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    7098
  • Lastpage
    7102
  • Abstract
    This paper presents our recent progress on translating TED speeches, a collection of public lectures covering a variety of topics. Specifically, we use word-to-word alignment to compose translation units of bilingual tuples and present a recurrent neural network-based translation model (RNNTM) that captures long-span context when estimating the translation probabilities of bilingual tuples. However, this RNNTM suffers from a severe data sparsity problem due to the large tuple vocabulary and limited training data. Therefore, a factored RNNTM, which takes the source and target phrases of the tuples as input features in addition to the bilingual tuples themselves, is proposed to partially address the problem. Our experimental results on the IWSLT2012 test sets show that the proposed models significantly improve translation quality over state-of-the-art phrase-based translation systems.
  • Keywords
    language translation; natural language processing; recurrent neural nets; speech processing; TED speech translation; bilingual tuple; long-span context; public lecture collection; recurrent neural network; translation model; word-to-word alignment; Context; Context modeling; Mathematical model; Recurrent neural networks; Training data; Vocabulary; IWSLT; spoken language translation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type

    conf

  • DOI
    10.1109/ICASSP.2014.6854977
  • Filename
    6854977