DocumentCode :
321486
Title :
Design of robust HMM speech recognizers using deterministic annealing
Author :
Rao, Ajit ; Rose, Kenneth ; Gersho, Allen
Author_Institution :
Dept. of Electrical & Computer Engineering, University of California, Santa Barbara, CA, USA
fYear :
1997
fDate :
14-17 Dec 1997
Firstpage :
466
Lastpage :
473
Abstract :
We attack the difficult problem of optimizing a hidden Markov model (HMM) based speech recognizer to minimize its misclassification rate. In conventional HMM recognizer design, the training data is divided into subsets of identically labeled tokens, and the HMM for each label is designed from the corresponding subset using a maximum likelihood (ML) objective. However, ML is a mismatched objective: ML design does not minimize the recognizer's misclassification rate. The misclassification rate is difficult to optimize directly because the cost surface is riddled with shallow local minima that tend to trap naive descent methods. We propose an approach based on the powerful technique of deterministic annealing (DA) to minimize the misclassification cost while avoiding shallow local minima. In the DA approach, the classifier's decision is randomized during design, and its expected misclassification rate is minimized while enforcing a level of randomness measured by the Shannon entropy. The entropy constraint is gradually withdrawn (annealing), and in the limit the cost function converges to the misclassification rate of a regular non-random recognizer. The algorithm is implementable by a low-complexity forward-backward procedure similar to the Baum-Welch re-estimation used in ML design. Our experiments on speaker-independent isolated-word speech recognition of clean and noise-corrupted utterances of letters of the difficult E-set {b, c, d, e, g, p, t, v, z} demonstrate that DA-designed recognizers offer consistent and substantial improvements in accuracy over ML-designed recognizers.
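The randomized-decision idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' algorithm: the Gibbs (softmax) form of the random decision, the inverse-temperature parameter `gamma`, and the `SCORES`/`LABELS` values are all assumptions introduced here for illustration, and the sketch only evaluates the cost at fixed scores rather than optimizing any HMM parameters. It shows that the expected misclassification rate of a randomized classifier approaches the hard misclassification rate as the entropy shrinks.

```python
import math

def gibbs_posteriors(scores, gamma):
    """Randomized decision (assumed Gibbs form): P(j) proportional to exp(gamma * score_j)."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(gamma * (s - m)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def expected_error_and_entropy(score_rows, labels, gamma):
    """Average expected misclassification rate and Shannon entropy (nats) over tokens."""
    err = 0.0
    ent = 0.0
    for scores, label in zip(score_rows, labels):
        p = gibbs_posteriors(scores, gamma)
        err += 1.0 - p[label]  # probability of choosing a wrong class
        ent -= sum(q * math.log(q) for q in p if q > 0.0)
    n = len(labels)
    return err / n, ent / n

def hard_error(score_rows, labels):
    """Misclassification rate of the ordinary (non-random) argmax decision."""
    wrong = sum(1 for s, y in zip(score_rows, labels) if s.index(max(s)) != y)
    return wrong / len(labels)

# Hypothetical per-class scores (e.g. HMM log-likelihoods): four tokens, three classes.
SCORES = [
    [-10.0, -12.0, -15.0],
    [-11.0, -9.5, -14.0],
    [-13.0, -12.5, -12.0],
    [-9.0, -8.0, -10.0],
]
LABELS = [0, 1, 2, 0]

if __name__ == "__main__":
    # Raising gamma lowers the entropy, mimicking the withdrawal of the entropy constraint.
    for gamma in (0.0, 0.5, 2.0, 50.0):
        err, ent = expected_error_and_entropy(SCORES, LABELS, gamma)
        print(f"gamma={gamma:5.1f}  expected error={err:.4f}  entropy={ent:.4f}")
    print(f"hard error={hard_error(SCORES, LABELS):.4f}")
```

At `gamma = 0` the decision is uniformly random (maximum entropy); as `gamma` grows, the expected error tends to the hard error of the argmax rule. In an actual design procedure the model parameters would be re-optimized at each stage of such a schedule, which this sketch deliberately omits.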
Keywords :
convergence; hidden Markov models; maximum likelihood estimation; minimisation; noise; performance evaluation; simulated annealing; speech recognition; HMM speech recognizers; Shannon entropy; cost function; deterministic annealing; forward-backward procedure; hidden Markov model; isolated word speech recognition; maximum likelihood; misclassification rate minimization; naive descent methods; optimization; randomness; shallow local minima; speaker-independent recognition; training data; Annealing; Cost function; Design optimization; Digital signal processing; Entropy; Hidden Markov models; Optimization methods; Robustness; Speech recognition; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-7803-3698-4
Type :
conf
DOI :
10.1109/ASRU.1997.659125
Filename :
659125