DocumentCode
3631371
Title
Advances in syntax-based Malay-English speech translation
Author
Bing Xiang;Bowen Zhou;Martin Cmejrek
Author_Institution
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
fYear
2009
Firstpage
4801
Lastpage
4804
Abstract
In this paper, we present advanced techniques that significantly improved the performance of the IBM Malay-English speech translation system. In this work, we generated linguistics-driven hierarchical rules to enhance the formal syntax-based translation model; designed an active learning approach with bi-directional translations that outperformed unsupervised training; and used translation direction information in the parallel training corpus to build direction-specific interpolated language models for machine translation. Together, these techniques yielded a 20% relative improvement in translation performance. A state-of-the-art Malay speech recognition system was also established as one of the crucial modules of the rapidly developed Malay-English speech translation system.
Keywords
"Speech recognition","Natural languages","Machine learning","Bidirectional control","Automatic speech recognition","Data mining","Tagging","Training data","Semisupervised learning","Humans"
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
ISSN
1520-6149
Print_ISBN
978-1-4244-2353-8
Electronic_ISBN
2379-190X
Type
conf
DOI
10.1109/ICASSP.2009.4960705
Filename
4960705