Title :
The integral decode: a smoothing technique for robust HMM-based speaker recognition
Author :
Roch, Marie ; Hurtig, Richard R.
Author_Institution :
Dept. of Comput. Sci., Iowa Univ., Iowa City, IA, USA
fDate :
7/1/2002
Abstract :
Previous work by Merhav and Lee (1993), among others, has emphasized that the conditions required for the maximum a posteriori (MAP) decision rule to be optimal for speech recognition do not hold, and has proposed techniques that adjust model parameters to improve speech recognition. In this article, we consider the problem of text-independent speaker recognition and present a new model called the integral decode. Like previous work in this area, the integral decode attempts to compensate for the fact that the conditions needed to ensure optimality of the MAP decision rule are not met in environments with corrupted observations and imperfect models. The integral decode is a smoothing operation in the feature space domain: a region of uncertainty is established about each noisy observation, and an approximation of the integral over that region is computed. The MAP decision rule is then applied to the smoothed likelihood estimates. In all tested conditions, the integral decode performs as well as or better than equivalent HMMs without the integral decode.
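The smoothing idea described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not say how the region of uncertainty is shaped or how the integral is approximated, so the code below assumes a hypercube of half-width `radius` around each observation and a simple sample average of the likelihood over that region. For brevity, each speaker is modelled by a diagonal-covariance Gaussian mixture (in effect a single-state HMM) instead of a full HMM. All names (`gmm_logpdf`, `smoothed_loglik`, `map_decide`, `radius`, `n_samples`) are hypothetical.

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log density of a diagonal-covariance Gaussian mixture at points x (N, D)."""
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]                      # (N, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    a = log_comp + np.log(weights)                                # (N, K)
    m = a.max(axis=1, keepdims=True)                              # log-sum-exp
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

def smoothed_loglik(obs, model, radius, offsets):
    """Sum over frames of the log of the likelihood averaged over a region.

    Assumption: the region is a hypercube of half-width `radius`, and the
    integral is approximated by averaging over fixed sample offsets.
    """
    total = 0.0
    for x in obs:
        pts = x + radius * offsets                                # (S, D)
        lp = gmm_logpdf(pts, *model)
        m = lp.max()
        total += m + np.log(np.mean(np.exp(lp - m)))              # log of mean likelihood
    return total

def map_decide(obs, models, log_priors, radius=0.5, n_samples=64, seed=0):
    """Apply the MAP rule to the smoothed log-likelihoods of each speaker model."""
    rng = np.random.default_rng(seed)
    offsets = rng.uniform(-1.0, 1.0, size=(n_samples, obs.shape[1]))
    scores = [smoothed_loglik(obs, model, radius, offsets) + lp
              for model, lp in zip(models, log_priors)]
    return int(np.argmax(scores)), scores

# Toy example: two single-component speaker models in a 2-D feature space.
models = [
    (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]])),
    (np.array([1.0]), np.array([[3.0, 3.0]]), np.array([[1.0, 1.0]])),
]
log_priors = np.log([0.5, 0.5])
obs = np.array([[2.8, 3.1], [3.3, 2.7], [2.9, 3.4]])
best, scores = map_decide(obs, models, log_priors)
print("selected speaker index:", best)
```

With `radius=0.0` every sample point coincides with the observation, so the sketch reduces to the ordinary (unsmoothed) MAP decision, which makes it easy to compare the two on the same data.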
Keywords :
approximation theory; decision theory; decoding; feature extraction; hidden Markov models; integral equations; noise; smoothing methods; speaker recognition; MAP decision rule; feature space domain; integral approximation; integral decode; maximum a posteriori decision rule; model parameters adjustment; noisy observation; optimal decision rule; robust HMM-based speaker recognition; smoothed likelihood estimates; smoothing operation; smoothing technique; speech recognition; text-independent speaker recognition; Performance evaluation; Robustness; Testing; Uncertainty; Working environment noise
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2002.800558