DocumentCode :
1295089
Title :
A Markov random field approach to Bayesian speaker adaptation
Author :
Shahshahani, Ben M.
Author_Institution :
Speech Bus. Unit, IBM Corp., Boca Raton, FL, USA
Volume :
5
Issue :
2
fYear :
1997
fDate :
3/1/1997
Firstpage :
183
Lastpage :
191
Abstract :
Speaker adaptation through Bayesian learning methodology is studied in this paper. In order to utilize the cross allophone correlations, a Markov random field (MRF) model is proposed as the joint prior distribution of the mean vectors of the allophones. Neighborhoods are defined as pairs of parameters between which strong correlations have been observed previously. Maximum a posteriori estimates of the mean vectors are obtained through an iterative optimization technique that converges to the global maximum of the posterior distribution. This process is similar to a recursive prediction of the parameters, where at each iteration each parameter is estimated by a weighted sum of two terms, the first predicted by the neighbors and the second by the samples. Further Bayesian smoothing of the output distributions is carried out by utilizing some simplifications on the functional forms of the marginal posterior distributions. The proposed method is fast, consuming only a few CPU minutes for processing hundreds of sentences from a new speaker on an IBM RS6000 Model 580 system. Experimental results show rapid improvement of recognition accuracy
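The recursive update described in the abstract, where each parameter is re-estimated as a weighted sum of a neighbor-based prediction and a sample-based estimate, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes scalar means, unit observation variance, a symmetric neighbor list, and a pairwise Gaussian MRF prior that penalizes differences between neighbors' shifts from their speaker-independent means. The function name `map_means` and the parameters `beta`, `prior_means`, and `counts` are hypothetical names chosen for illustration.

```python
def map_means(sample_means, counts, neighbors, prior_means, beta=1.0, iters=200):
    """Coordinate-wise MAP updates under an assumed pairwise Gaussian MRF prior.

    Assumed log posterior (up to a constant), with m = prior_means:
        -1/2 * sum_i counts[i] * (sample_means[i] - mu[i])**2
        -beta/2 * sum over neighbor pairs (i, j) of
                  ((mu[i] - m[i]) - (mu[j] - m[j]))**2
    This objective is concave, so cycling through the coordinates and
    maximizing over one mean at a time approaches its global maximum.
    """
    mu = list(prior_means)  # start from the speaker-independent means
    for _ in range(iters):
        for i in range(len(mu)):
            nbrs = neighbors[i]
            w_prior = beta * len(nbrs)  # weight of the neighbor prediction
            w_data = counts[i]          # weight of the sample estimate
            if w_prior + w_data == 0:
                continue                # no neighbors and no data: keep the prior mean
            pred = prior_means[i]
            if nbrs:
                # Neighbor prediction: prior mean plus the average shift of the neighbors.
                pred += sum(mu[j] - prior_means[j] for j in nbrs) / len(nbrs)
            # Weighted sum of the two terms (neighbors first, samples second).
            mu[i] = (w_prior * pred + w_data * sample_means[i]) / (w_prior + w_data)
    return mu


# Two neighboring allophones; only the first has adaptation data.
adapted = map_means(
    sample_means=[1.0, 0.0],
    counts=[10, 0],
    neighbors=[[1], [0]],
    prior_means=[0.0, 0.0],
)
```

In this toy setup the second allophone has no samples from the new speaker, so its estimate is driven entirely by its neighbor, which is the kind of cross-allophone information sharing the abstract describes.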
Keywords :
Bayes methods; Markov processes; convergence of numerical methods; correlation methods; iterative methods; learning (artificial intelligence); maximum likelihood estimation; optimisation; random processes; smoothing methods; speech recognition; Bayesian learning; Bayesian smoothing; Bayesian speaker adaptation; IBM RS6000 Model 580 system; Markov random field approach; cross allophone correlations; functional forms; global maximum; iterative optimization technique; joint prior distribution; maximum a posteriori estimates; neighborhoods; output distributions; posterior distribution; recursive prediction; weighted sum; Bayesian methods; Data mining; Markov random fields; Maximum a posteriori estimation; Parameter estimation; Recursive estimation; Smoothing methods; Speech recognition; Training data; Vocabulary;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.554780
Filename :
554780