  • DocumentCode
    835933
  • Title
    System Combination for Machine Translation of Spoken and Written Language
  • Author
    Matusov, Evgeny ; Leusch, Gregor ; Banchs, Rafael E. ; Bertoldi, Nicola ; Dechelotte, Daniel ; Federico, Marcello ; Kolss, Muntsin ; Lee, Young-Suk ; Marino, Jose B. ; Paulik, Matthias ; Roukos, Salim ; Schwenk, Holger ; Ney, Hermann
  • Author_Institution
    RWTH Aachen Univ., Aachen
  • Volume
    16
  • Issue
    7
  • fYear
    2008
  • Firstpage
    1222
  • Lastpage
    1237
  • Abstract
    This paper describes an approach for computing a consensus translation from the outputs of multiple machine translation (MT) systems. The consensus translation is computed by weighted majority voting on a confusion network, similarly to the well-established ROVER approach of Fiscus for combining speech recognition hypotheses. To create the confusion network, pairwise word alignments of the original MT hypotheses are learned using an enhanced statistical alignment algorithm that explicitly models word reordering. The context of a whole corpus of automatic translations rather than a single sentence is taken into account in order to achieve high alignment quality. The confusion network is rescored with a special language model, and the consensus translation is extracted as the best path. The proposed system combination approach was evaluated in the framework of the TC-STAR speech translation project. Up to six state-of-the-art statistical phrase-based translation systems from different project partners were combined in the experiments. Significant improvements in translation quality from Spanish to English and from English to Spanish in comparison with the best of the individual MT systems were achieved under official evaluation conditions.
  • Keywords
    language translation; natural languages; speech recognition; ROVER approach; TC-STAR speech translation project; consensus translation; machine translation; natural language; speech recognition hypotheses; spoken language; state-of-the-art statistical phrase-based translation system; statistical alignment algorithm; written language; Automatic speech recognition; Computer networks; Iterative methods; Lattices; Speech analysis; Speech processing; Text processing; Voting
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2008.914970
  • Filename
    4599393