Title :
Acoustic recognition component of an 86000-word speech recognizer
Author :
Deng, L. ; Gupta, V. ; Lennig, M. ; Kenny, P. ; Mermelstein, P.
Author_Institution :
INRS-Telecommun., Montreal, Que., Canada
Abstract :
Recent results obtained with a hidden Markov model (HMM)-based acoustic recognizer using a virtually unlimited vocabulary (86000 words) to perform speaker-dependent isolated-word recognition are described. The task domain of this recognizer is quite general, consisting of paragraphs read from various newspapers, books, and magazines. The results of a comparative acoustic recognition study using various types of HMMs and various amounts of training data (from 700 to about 4000 words) are presented. The models explored include context-dependent allophonic HMMs (including generalized diphone and triphone models with unimodal Gaussian output densities) and context-independent phonemic HMMs (using either unimodal or mixture densities). Experimental results indicate that phonemic HMMs with many components in the mixture output densities provide the highest acoustic recognition accuracy. The acoustic recognition accuracy for a total of about 7000 test words spoken by four male and five female speakers is 82%. Recognition accuracy after application of the language model increases to 92%.
Keywords :
Markov processes; speech recognition; HMM; acoustic recognition accuracy; context-dependent allophonic HMM; context-independent phonemic HMM; hidden Markov model; mixture HMM; speaker-dependent isolated-word recognition; unimodal phonemic HMM; unlimited vocabulary; Acoustic testing; Books; Context modeling; Gaussian distribution; Hidden Markov models; Loudspeakers; Microwave integrated circuits; Speech analysis; Speech recognition; Training data; Vocabulary
Conference_Title :
1990 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-90)
Conference_Location :
Albuquerque, NM
DOI :
10.1109/ICASSP.1990.115896