Title :
Improving wordspotting performance with artificially generated data
Author :
Chang, Eric I. ; Lippmann, Richard P.
Author_Institution :
Nuance Commun., Menlo Park, CA, USA
Abstract :
Lack of training data is a major problem that limits the performance of speech recognizers. Performance can often only be improved by expensive collection of data from many different talkers. This paper demonstrates that artificially transformed speech can increase the variability of training data and increase the performance of a wordspotter without additional expensive data collection. This approach was shown to be effective on a high-performance whole-word wordspotter on the Switchboard Credit Card database. The proposed approach, used in combination with a discriminative training approach, increased the figure of merit of the wordspotting system by 9.4 percentage points (62.5% to 71.9%). The increase in performance provided by artificially transforming speech was roughly equivalent to the increase that would have been provided by doubling the amount of training data. The performance of the wordspotter was also compared to that of human listeners, who were able to achieve lower error rates because of improved consonant recognition.
Keywords :
hidden Markov models; speech processing; speech recognition; HMM; Switchboard Credit Card database; artificially generated data; artificially transformed speech; consonant recognition; discriminative training; error rates; figure of merit; high performance whole word wordspotter; human listeners; speech recognizers; talker variability; training data; wordspotting performance; wordspotting system; Credit cards; Databases; Error analysis; Hidden Markov models; Humans; Laboratories; Speech processing; Speech recognition; Training data; Viterbi algorithm
Conference_Title :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
0-7803-3192-3
DOI :
10.1109/ICASSP.1996.541149