Title :
A maximum-likelihood approach to stochastic matching for robust speech recognition
Author :
Sankar, Ananth ; Lee, Chin-Hui
Author_Institution :
Speech Res. Dept., AT&T Bell Labs., Murray Hill, NJ, USA
Date :
5/1/1996
Abstract :
Presents a maximum-likelihood (ML) stochastic matching approach that decreases the acoustic mismatch between a test utterance and a given set of speech models, so as to reduce the recognition performance degradation caused by distortions in the test utterance and/or the model set. We assume that the speech signal is modeled by a set of subword hidden Markov models (HMMs) Λx. The mismatch between the observed test utterance Y and the models Λx can be reduced in two ways: 1) by an inverse distortion function Fν(.) that maps Y into an utterance X that better matches the models Λx, and 2) by a model transformation function Gη(.) that maps Λx to a transformed model Λy that better matches the utterance Y. We assume the functional form of the transformation Fν(.) or Gη(.) and estimate the parameters ν or η in an ML manner using the expectation-maximization (EM) algorithm. The choice of the form of Fν(.) or Gη(.) is based on prior knowledge of the nature of the acoustic mismatch. The stochastic matching algorithm operates only on the given test utterance and the given set of speech models; no additional training data is required to estimate the mismatch prior to actual testing. Experimental results are presented to study the properties of the proposed algorithm and to verify the efficacy of the approach in improving the performance of an HMM-based continuous speech recognition system in the presence of mismatch due to different transducers and transmission channels.
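The feature-space estimation described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the inverse distortion Fν(.) is a single additive bias b (so X = Y - b) and that a small diagonal-covariance Gaussian mixture stands in for the HMM set Λx; the function names, the mixture parameters, and the synthetic data are all hypothetical and are not taken from the paper. Under those assumptions the E-step computes component posteriors for the bias-corrected frames, and the M-step has a closed-form update for b.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Component posteriors for one frame, plus the frame log-likelihood."""
    logs = [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # log-sum-exp for numerical stability
    norm = top + math.log(sum(math.exp(l - top) for l in logs))
    return [math.exp(l - norm) for l in logs], norm


def estimate_bias(frames, weights, means, variances, iterations=20):
    """EM estimate of an additive bias b, where X = Y - b is scored by the GMM.

    Assumption: the GMM is a simplified stand-in for the HMM set.
    Returns the bias and the log-likelihood measured before each update.
    """
    dim = len(frames[0])
    bias = [0.0] * dim
    history = []
    for _ in range(iterations):
        num = [0.0] * dim
        den = [0.0] * dim
        total = 0.0
        for y in frames:
            x = [yi - bi for yi, bi in zip(y, bias)]
            gamma, ll = posteriors(x, weights, means, variances)  # E-step
            total += ll
            for g, m, v in zip(gamma, means, variances):
                for i in range(dim):
                    num[i] += g * (y[i] - m[i]) / v[i]
                    den[i] += g / v[i]
        history.append(total)
        bias = [n / d for n, d in zip(num, den)]  # M-step (closed form)
    return bias, history


# Synthetic check: draw clean frames from the GMM, then shift them by a known bias.
random.seed(0)
weights = [0.5, 0.5]
means = [[-2.0, 0.0], [2.0, 1.0]]
variances = [[0.5, 0.5], [0.5, 0.5]]
true_bias = [0.7, -0.4]
frames = []
for _ in range(500):
    k = random.choices(range(len(weights)), weights=weights)[0]
    frames.append([random.gauss(m, math.sqrt(v)) + b
                   for m, v, b in zip(means[k], variances[k], true_bias)])

bias, history = estimate_bias(frames, weights, means, variances)
```

Only the distorted frames and the fixed model parameters enter the estimate, which mirrors the abstract's point that no additional training data is needed. The log-likelihood values in `history` should not decrease from one iteration to the next, as expected of EM.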
Keywords :
hidden Markov models; inverse problems; maximum likelihood estimation; optimisation; speech recognition; stochastic processes; HMM-based continuous speech recognition system; acoustic mismatch; expectation-maximization; hidden Markov models; inverse distortion function; maximum-likelihood approach; mismatch; model transformation function; recognition performance degradation; robust speech recognition; stochastic matching; test utterance; Acoustic distortion; Acoustic testing; Degradation; Hidden Markov models; Maximum likelihood estimation; Predistortion; Robustness; Speech recognition; Stochastic processes; Time of arrival estimation;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on