DocumentCode
321486
Title
Design of robust HMM speech recognizers using deterministic annealing
Author
Rao, Ajit ; Rose, Kenneth ; Gersho, Allen
Author_Institution
Dept. of Electr. & Comput. Eng., California Univ., Santa Barbara, CA, USA
fYear
1997
fDate
14-17 Dec 1997
Firstpage
466
Lastpage
473
Abstract
We attack the difficult problem of optimizing a hidden Markov model (HMM) based speech recognizer to minimize its misclassification rate. In conventional HMM recognizer design, the training data is divided into subsets of identically labeled tokens, and the HMM for each label is designed from the corresponding subset using a maximum likelihood (ML) objective. However, ML is a mismatched objective: ML design does not minimize the recognizer's misclassification rate. The misclassification rate is difficult to optimize directly because the cost surface is riddled with shallow local minima that tend to trap naive descent methods. We propose an approach based on the powerful technique of deterministic annealing (DA) to minimize the misclassification cost while avoiding shallow local minima. In the DA approach, the classifier's decision is randomized during design, and its expected misclassification rate is minimized while enforcing a level of randomness measured by the Shannon entropy. The entropy constraint is gradually withdrawn (annealing), and in the limit the cost function converges to the misclassification rate of a regular non-random recognizer. The algorithm can be implemented by a low-complexity forward-backward procedure similar to the Baum-Welch re-estimation used in ML design. Our experiments on speaker-independent isolated-word speech recognition of clean and noise-corrupted utterances of letters of the difficult E-set = {b, c, d, e, g, p, t, v, z} demonstrate that DA-designed recognizers offer consistent and substantial improvements in accuracy over ML-designed recognizers.
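The randomized decision and its annealing limit described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the record: the random decision rule is taken to be a Gibbs (softmax) distribution over per-class scores with inverse temperature `gamma`, which is the usual maximum-entropy form; the scores in `SCORES` are made-up stand-ins for per-class HMM log-likelihoods; and all function names are hypothetical. The sketch only evaluates the cost for fixed scores at several temperatures. It does not re-estimate any HMM parameters, which the actual design procedure would do at each stage.

```python
import math


def gibbs_posteriors(scores, gamma):
    """Assumed randomized decision rule: softmax of class scores at inverse temperature gamma."""
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(gamma * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def expected_error_and_entropy(score_rows, labels, gamma):
    """Average expected misclassification rate and Shannon entropy (nats) over all tokens."""
    err = 0.0
    ent = 0.0
    for scores, label in zip(score_rows, labels):
        probs = gibbs_posteriors(scores, gamma)
        err += 1.0 - probs[label]  # probability of choosing a wrong class
        ent -= sum(p * math.log(p) for p in probs if p > 0.0)
    n = len(labels)
    return err / n, ent / n


def hard_error(score_rows, labels):
    """Misclassification rate of the ordinary non-random (argmax) recognizer."""
    wrong = sum(1 for s, y in zip(score_rows, labels) if s.index(max(s)) != y)
    return wrong / len(labels)


# Hypothetical scores for four tokens and three classes (illustrative only).
SCORES = [
    [-10.0, -12.0, -15.0],
    [-11.0, -10.5, -14.0],
    [-13.0, -12.0, -11.0],
    [-9.0, -9.5, -12.0],
]
LABELS = [0, 0, 2, 1]

if __name__ == "__main__":
    # Raising gamma corresponds to withdrawing the entropy constraint.
    for gamma in (0.0, 0.5, 2.0, 10.0, 200.0):
        err, ent = expected_error_and_entropy(SCORES, LABELS, gamma)
        print(f"gamma={gamma:6.1f}  expected error={err:.4f}  entropy={ent:.4f}")
    print("hard error:", hard_error(SCORES, LABELS))
```

At `gamma = 0` the decision is uniformly random and the entropy is maximal; as `gamma` grows the entropy falls toward zero and the expected error approaches the hard error rate of the argmax recognizer, which mirrors the limiting behaviour the abstract describes.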
Keywords
convergence; hidden Markov models; maximum likelihood estimation; minimisation; noise; performance evaluation; simulated annealing; speech recognition; HMM speech recognizers; Shannon entropy; convergence; cost function; deterministic annealing; forward-backward procedure; hidden Markov model; isolated word speech recognition; maximum likelihood; misclassification rate minimization; naive descent methods; noise; optimization; performance evaluation; randomness; shallow local minima; speaker-independent recognition; training data; Annealing; Cost function; Design optimization; Digital signal processing; Entropy; Hidden Markov models; Optimization methods; Robustness; Speech recognition; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on
Conference_Location
Santa Barbara, CA
Print_ISBN
0-7803-3698-4
Type
conf
DOI
10.1109/ASRU.1997.659125
Filename
659125