Title :
On the phonetic structure of a large hidden Markov model
Author :
Pepper, David J. ; Clements, Mark A.
Author_Institution :
Sch. of Electr. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
It is shown that the structure of a large ergodic hidden Markov model (HMM) can be decomposed into a set of substructures representing the English phonemes. The large HMM, which is trained using a standard forward-backward algorithm, is shown to be organized in a way that reflects the phonetic nature of speech. It is shown that the states of the HMM can be classified in terms of a set of broad phonetic classes and that the spectra associated with the states are related to each state's use in the phonetic models. The phonetic models are shown to have internal structures reflecting the acoustic nature of the individual phonemes. The large HMMs used in this study are trained using the continuous speech multi-speaker TIMIT database employing a continuous observation density training algorithm. On a subset of the database, with 80 male speakers used for training and a separate set of 24 speakers reserved for testing, the phonetic recognition system achieved a 52% recognition rate with 14% insertions.
Keywords :
Markov processes; speech recognition; English phonemes; HMM; broad phonetic classes; continuous observation density training algorithm; continuous speech multi-speaker TIMIT database; internal structures; large ergodic hidden Markov model; male speakers; phoneme acoustic nature; phonetic recognition system; standard forward-backward algorithm; state associated spectra; Automatic speech recognition; Automatic testing; Cepstral analysis; Clustering algorithms; Databases; Hidden Markov models; Loudspeakers; Speech analysis; Speech recognition; Viterbi algorithm
Conference_Title :
Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on
Conference_Location :
Toronto, Ont.
Print_ISBN :
0-7803-0003-3
DOI :
10.1109/ICASSP.1991.150377