• DocumentCode
    3631371
  • Title
    Advances in syntax-based Malay-English speech translation
  • Author
    Bing Xiang;Bowen Zhou;Martin Cmejrek
  • Author_Institution
    IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
  • fYear
    2009
  • Firstpage
    4801
  • Lastpage
    4804
  • Abstract
    In this paper, we present advanced techniques that significantly improved the performance of the IBM Malay-English speech translation system. In this work, we generated linguistics-driven hierarchical rules to enhance the formal syntax-based translation model; designed an active learning approach with bi-directional translations that outperformed unsupervised training; and utilized translation direction information in the parallel training corpus to build direction-specific interpolated language models for machine translation. Together, these techniques achieved a 20% relative improvement in translation performance. A state-of-the-art Malay speech recognition system was also established as one of the crucial modules in the rapidly developed Malay-English speech translation system.
  • Keywords
    "Speech recognition","Natural languages","Machine learning","Bidirectional control","Automatic speech recognition","Data mining","Tagging","Training data","Semisupervised learning","Humans"
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    2379-190X
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960705
  • Filename
    4960705