DocumentCode :
394214
Title :
Recognition method with parametric trajectory generated from mixture distribution HMMs
Author :
Minami, Yasuhiro ; McDermott, Erik ; Nakamura, Atsushi ; Katagiri, Shigem
Author_Institution :
Speech Open Lab., NTT Corp., Kyoto, Japan
Volume :
1
fYear :
2003
fDate :
6-10 April 2003
Abstract :
We have proposed a new speech recognition technique that generates a speech trajectory from HMMs by maximizing the likelihood of the trajectory, while accounting for the relation between the cepstrum and the dynamic cepstrum coefficients. This method has the major advantage that the relation, which is ignored in conventional speech recognition, is directly used in the speech recognition phase. This paper describes an extension of the method for dealing with HMMs whose distributions are mixture Gaussian distributions. The method chooses the sequence of Gaussian distributions by selecting the best Gaussian distribution in the state during Viterbi decoding. Speaker-independent speech recognition experiments were carried out. The proposed method obtained an 18.2% reduction in error rate for the task, proving that the proposed method is effective even for Gaussian mixture HMMs.
Keywords :
Gaussian distribution; Viterbi decoding; cepstral analysis; hidden Markov models; maximum likelihood estimation; speech recognition; HMMs; cepstrum; error rate; mixture Gaussian distributions; speech recognition technique; speech trajectory; Error analysis; Laboratories; Loudspeakers; Polynomials; Viterbi algorithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1198732
Filename :
1198732