DocumentCode :
394217
Title :
Hybrid modeling of PHMM and HMM for speech recognition
Author :
Ogawa, Tetsuji ; Kobayashi, Tetsunori
Author_Institution :
Dept. EECE, Waseda University, Tokyo, Japan
Volume :
1
fYear :
2003
fDate :
6-10 April 2003
Abstract :
A hybrid acoustic model combining the partly hidden Markov model (PHMM) and the HMM is proposed. PHMM was proposed in our previous work to deal with the complicated temporal changes of acoustic features (Ogawa, T. and Kobayashi, T., Proc. ICSLP 2002, pp. 2673-6, 2002). It can realize observation-dependent behavior in both observations and state transitions. It achieved good performance, but some errors with trends different from those of HMM remained. We have designed a new acoustic model on the basis of PHMM, in which the observation and state transition probabilities are defined by the geometric means of the PHMM-based and HMM-based ones. In this framework, a word hypothesis that is given a low score by either PHMM or HMM is almost eliminated as a probable candidate. Since many errors are due to high scores for incorrect categories rather than low scores for the correct category, this property contributes to reducing errors. Moreover, the proposed model is more stable than PHMM because the higher-order statistics of PHMM, which are generally accurate but sometimes less reliable, are smoothed by the lower-order statistics of HMM, which are less accurate but robust. Experimental results show the effectiveness of the proposed model: it reduces word errors by 25% compared with HMM.
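The geometric-mean combination described in the abstract can be illustrated in the log domain, where a geometric mean of two probabilities becomes an average of their logarithms. The following is a minimal sketch, not the authors' implementation: the function name, the `weight` parameter (equal weighting by default), and the per-word scores are all hypothetical, and the sketch applies the mean to whole-hypothesis scores for brevity, whereas the abstract defines it on the observation and state transition probabilities. Any renormalization the paper may apply is omitted.

```python
import math


def geometric_mean_log_score(log_p_phmm, log_p_hmm, weight=0.5):
    """Log of the weighted geometric mean of two probabilities.

    With weight=0.5 this is log(sqrt(p_phmm * p_hmm)).
    The weight parameter is an assumption made for illustration.
    """
    return weight * log_p_phmm + (1.0 - weight) * log_p_hmm


# Hypothetical log-likelihoods: (PHMM score, HMM score) per word hypothesis.
hypotheses = {
    "word_a": (-10.0, -12.0),  # moderate score under both models
    "word_b": (-4.0, -30.0),   # high under PHMM, very low under HMM
    "word_c": (-11.0, -11.5),  # moderate score under both models
}

combined = {
    word: geometric_mean_log_score(phmm, hmm)
    for word, (phmm, hmm) in hypotheses.items()
}

# The hypothesis with the highest combined score wins.
best = max(combined, key=combined.get)
print(best, combined)
```

In this made-up example, PHMM alone would rank `word_b` first, but its low HMM score pulls the combined score down, so a hypothesis scored poorly by either model drops out of contention, which is the property the abstract credits with reducing errors.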
Keywords :
acoustic signal processing; hidden Markov models; higher order statistics; probability; speech recognition; HMM; acoustic feature temporal changes; geometric mean; higher order statistics; hybrid acoustic model; lower order statistics; observation probability; partly hidden Markov model; state transition probability; word errors; word hypothesis; Error correction; Hidden Markov models; Higher order statistics; Maximum likelihood estimation; Parameter estimation; Robustness; Solid modeling; Speech recognition; Stochastic processes; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1198736
Filename :
1198736