Developing high performance ASR in the IBM multilingual speech-to-speech translation system

Author

Cui, Xiaodong ; Gu, Liang ; Xiang, Bing ; Zhang, Wei ; Gao, Yuqing

Author_Institution

T. J. Watson Res. Center, IBM, Yorktown Heights, NY

fYear

2008

fDate

March 31 2008-April 4 2008

Firstpage

5121

Lastpage

5124

Abstract

This paper presents our recent development of the real-time speech recognition component in the IBM English/Iraqi Arabic speech-to-speech translation system for the DARPA Transtac project. We describe the details of the acoustic and language modeling that lead to high recognition accuracy and noise robustness, and we report the performance of the system on evaluation sets of spontaneous conversational speech. We also introduce the streaming decoding structure and several speedup techniques that achieve the best recognition accuracy at a recognition speed of about 0.3 × RT.

Keywords

decoding; natural languages; speech recognition; IBM multilingual speech translation; acoustic modeling; language modeling; large vocabulary spontaneous speech recognition; noise robustness; streaming mode decoding; Acoustic noise; Automatic speech recognition; Hidden Markov models; Linear discriminant analysis; Real time systems; Speech enhancement; discriminative training; multilingual speech translation

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location

Las Vegas, NV

ISSN

1520-6149

Print_ISBN

978-1-4244-1483-3

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2008.4518811

Filename

4518811