• DocumentCode
    2910575
  • Title

    Character-Level System Combination: An Empirical Study for English-to-Chinese Spoken Language Translation

  • Author

    Du, Jinhua

  • Author_Institution
    Fac. of Autom. & Inf. Eng, Xi'an Univ. of Technol., Xi'an, China
  • fYear
    2011
  • fDate
    15-17 Nov. 2011
  • Firstpage
    181
  • Lastpage
    184
  • Abstract
    This paper proposes a character-level system combination strategy for English-to-Chinese spoken language translation. For languages such as Chinese, in which word boundaries are not orthographically marked, word segmentation, which splits a sentence into a sequence of words, is often required for many Natural Language Processing tasks. In this paper we evaluate the impact of segmentation (of spoken data) on the performance of system combination, and show that using inappropriate segmentation in system combination can result in inferior performance compared to single systems. We further demonstrate that using characters as the basic translation unit in system combination on the IWSLT ASR translation task leads to significant gains in translation quality in terms of BLEU and NIST scores.
  • Keywords
    language translation; natural language processing; BLEU score; English-to-Chinese spoken language translation; IWSLT ASR translation task; NIST score; character-level system combination; natural language processing task; spoken data segmentation; translation quality; word boundary; word segmentation; Accuracy; Decoding; Heuristic algorithms; Hidden Markov models; Measurement; NIST; Training data; Character-level; Chinese Spoken Language Translation; System Combination;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asian Language Processing (IALP), 2011 International Conference on
  • Conference_Location
    Penang
  • Print_ISBN
    978-1-4577-1733-8
  • Type

    conf

  • DOI
    10.1109/IALP.2011.47
  • Filename
    6121498