Title :
Multivariate-state hidden Markov models for simultaneous transcription of phones and formants
Author :
Hasegawa-Johnson, Mark
Author_Institution :
Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL, USA
Abstract :
A multivariate-state HMM (an HMM with a vector state variable) can be used to find jointly optimal phonetic and formant transcriptions of an utterance. The complexity of searching a multivariate state space using the Baum-Welch algorithm is substantial, but may be significantly reduced if the formant frequencies are assumed to be conditionally independent given knowledge of the phone. Operating with a known phonetic transcription, the multivariate-state model can provide a maximum a posteriori formant trajectory, complete with confidence limits on each of the formant frequency measurements. The model can also be used as a phonetic classifier by adding the probabilities of all possible formant trajectories. A test system is described which requires only nine trainable parameters per formant per phonetic state: five parameters to model formant transitions, and four to model spectral observations. Further simplifications were achieved through parameter tying.
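The complexity argument in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes each formant is quantized to a grid of frequency bins, uses a Viterbi (best-path) search in place of Baum-Welch, and shares one transition matrix across all frames instead of making it phone-dependent. The function names and the grid size are hypothetical.

```python
import numpy as np

def joint_vs_factored_states(n_bins, n_formants):
    """Formant states per phone: one joint grid vs. one grid per formant."""
    return n_bins ** n_formants, n_bins * n_formants

def viterbi_track(log_obs, log_trans):
    """Most likely bin sequence for a single formant.

    log_obs:   (T, K) log-likelihood of each frequency bin at each frame
    log_trans: (K, K) log transition score from bin i to bin j (assumed
               fixed here; a phone-dependent model would vary it per frame)
    """
    T, K = log_obs.shape
    score = log_obs[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans      # cand[i, j]: leave bin i, enter bin j
        back[t] = np.argmax(cand, axis=0)      # best predecessor of each bin j
        score = cand[back[t], np.arange(K)] + log_obs[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):              # trace the best path backwards
        path[t - 1] = back[t, path[t]]
    return path

def track_formants(log_obs_per_formant, log_trans):
    """Decode each formant separately; valid only under the independence assumption."""
    return [viterbi_track(lo, log_trans) for lo in log_obs_per_formant]
```

With, say, 32 bins and 3 formants, `joint_vs_factored_states(32, 3)` gives 32768 joint states against 96 factored ones, which is the kind of reduction the conditional-independence assumption buys.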
Keywords :
hidden Markov models; maximum likelihood estimation; pattern classification; speech recognition; Baum-Welch algorithm; formant frequencies; formants; maximum a posteriori formant trajectory; multivariate state space; multivariate-state hidden Markov models; phones; phonetic classifier; phonetic transcription; spectral observations; transcription; vector state variable; Costs; Equations; Frequency measurement; Hidden Markov models; Linear predictive coding; Speech recognition; State-space methods; System testing; Vector quantization; Viterbi algorithm;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.861822