Title :
Speaker normalization on conversational telephone speech
Author :
Wegmann, Steven ; McAllaster, Don ; Orloff, J. ; Peskin, Barbara
Author_Institution :
Dragon Syst. Inc., Newton, MA, USA
Abstract :
This paper reports on a simplified system for determining vocal tract normalization. Such normalization has led to significant gains in recognition accuracy by reducing variability among speakers and allowing the pooling of training data and the construction of sharper models. But standard methods for determining the warp scale have been extremely cumbersome, generally requiring multiple recognition passes. We present a new system for warp scale selection which uses a simple generic voiced speech model to rapidly select appropriate frequency scales. The selection is sufficiently streamlined that it can be moved completely into the front-end processing. Using this system on a standard test of the Switchboard Corpus, we have achieved relative reductions in word error rates of 12% over unnormalized gender-independent models and 6% over our best unnormalized gender-dependent models.
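The warp scale selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model or the warping function, so the sketch assumes a single diagonal Gaussian as the "generic voiced speech model", a purely linear frequency scaling, a fixed grid of candidate scales, and toy three-dimensional frames assumed to be voiced already. All names (`warp_frame`, `log_likelihood`, `select_warp`, `MEAN`, `STD`) are hypothetical.

```python
import math

def warp_frame(frame, alpha):
    # Assumed warping: linear scaling of each frequency-like feature.
    return [alpha * f for f in frame]

def log_likelihood(frame, mean, std):
    # Log-density of one frame under a diagonal Gaussian (assumed model).
    total = 0.0
    for x, m, s in zip(frame, mean, std):
        total += -0.5 * (math.log(2.0 * math.pi * s * s) + ((x - m) / s) ** 2)
    return total

def select_warp(frames, mean, std, grid):
    # Score every candidate scale against the generic model and keep the
    # best one. No recognition pass is involved, only frame-level scoring.
    def score(alpha):
        return sum(log_likelihood(warp_frame(f, alpha), mean, std) for f in frames)
    return max(grid, key=score)

# Hypothetical generic model parameters (illustrative values only).
MEAN = [500.0, 1500.0, 2500.0]
STD = [100.0, 200.0, 300.0]

# Toy speaker whose features sit about 1/1.1 below the model mean.
frames = [[s * m / 1.1 for m in MEAN] for s in (0.98, 1.0, 1.02)]

# Candidate warp scales from 0.80 to 1.20 in steps of 0.02.
grid = [round(0.80 + 0.02 * i, 2) for i in range(21)]

best = select_warp(frames, MEAN, STD, grid)
print(best)
```

Because the selection in this sketch needs only per-frame scores under one fixed model, it could run before recognition starts, which matches the abstract's point about moving the step into front-end processing.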
Keywords :
speech processing; speech recognition; telephony; Switchboard Corpus; conversational telephone speech; frequency scales; front-end processing; generic voiced speech model; recognition accuracy; speaker normalization; standard test; training data; unnormalized gender-independent models; vocal tract normalization; warp scale selection; word error rates; Cepstral analysis; Computational efficiency; Error analysis; Frequency; Linear discriminant analysis; Signal processing; Speech processing; System testing; Telephony; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
0-7803-3192-3
DOI :
10.1109/ICASSP.1996.541101