Title :
Tandem acoustic modeling in large-vocabulary recognition
Author :
Ellis, Daniel P W ; Singh, Rita ; Sivadas, Sunil
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
Abstract :
In the tandem approach to modeling the acoustic signal, a neural-net preprocessor is first discriminatively trained to estimate posterior probabilities across a phone set. These are then used as feature inputs for a conventional hidden Markov model (HMM) based speech recognizer, which relearns the associations to subword units. We apply the tandem approach to the data provided for the first Speech in Noisy Environments (SPINE1) evaluation conducted by the Naval Research Laboratory (NRL) in August 2000. In our previous experience with the ETSI Aurora noisy digits (a small-vocabulary, high-noise task), the tandem approach achieved error-rate reductions of over 50% relative to the HMM baseline. For SPINE1, a larger task involving more spontaneous speech, we find that, when context-independent models are used, the tandem features continue to result in large reductions in word-error rates relative to those achieved by systems using standard MFC or PLP features. However, these improvements do not carry over to context-dependent models. This may be attributable to several factors, which are discussed in the paper.
Keywords :
hidden Markov models; learning (artificial intelligence); neural nets; speech recognition; ETSI Aurora noisy digits; HMM based speech recognizer; SPINE1 evaluation; Speech in Noisy Environments evaluation; acoustic signal; context-independent models; hidden Markov model based speech recognizer; large-vocabulary speech recognition; neural-net preprocessor; phone set; posterior probabilities; spontaneous speech; tandem acoustic modeling; Acoustic noise; Context modeling; Hidden Markov models; Military communication; Neural networks; Signal to noise ratio; Speech recognition; Telecommunication standards; US Department of Transportation; Working environment noise;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
Print_ISBN :
0-7803-7041-4
DOI :
10.1109/ICASSP.2001.940881