Title :
Trapping conversational speech: extending TRAP/tandem approaches to conversational telephone speech recognition
Author :
Morgan, Nelson ; Chen, Barry Y. ; Zhu, Qifeng ; Stolcke, Andreas
Author_Institution :
Int. Comput. Sci. Inst., Berkeley, CA, USA
Abstract :
Temporal patterns (TRAP) and tandem MLP/HMM approaches incorporate feature streams computed from longer time intervals than the conventional short-time analysis. These methods have been used for challenging small- and medium-vocabulary recognition tasks, such as Aurora and SPINE. Conversational telephone speech recognition is a difficult large-vocabulary task, with current systems giving incorrect output for 20-40% of the words, depending on the system complexity and test set. Training and test times for this problem also tend to be relatively long, making rapid development quite difficult. In this paper we report experiments with a reduced conversational speech task that led to the adoption of a number of engineering decisions for the design of an acoustic front end. We then describe our results with this front end on a full-vocabulary conversational telephone speech task. In both cases the front end yielded significant improvements over the baseline.
Keywords :
feature extraction; speech recognition; vocabulary; Aurora; SPINE; TRAP; acoustic front end; conversational telephone speech recognition; feature streams; large-vocabulary task; tandem MLP/HMM; temporal patterns; time intervals; vocabulary recognition; Acoustic testing; Acoustical engineering; Computer science; Design engineering; Hidden Markov models; Pattern analysis; Speech analysis; Speech recognition; System testing; Telephony
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326041