DocumentCode
1502186
Title
A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model
Author
Chengalvarayan, Rathinavelu ; Deng, Li
Author_Institution
Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada
Volume
9
Issue
5
fYear
2001
fDate
7/1/2001
Firstpage
549
Lastpage
557
Abstract
A maximum a posteriori (MAP) formulation of speaker adaptation is presented for the trended, or nonstationary-state, hidden Markov model (HMM), in which the Gaussian means in each HMM state are characterized by time-varying polynomial trend functions of the state sojourn time. Assuming that the polynomial coefficients in the trend functions are uncorrelated, we obtain analytical results for the MAP estimates of the parameters, including the time-varying means and time-invariant precisions. We implemented a speech recognizer based on these results and evaluated it in speaker adaptation experiments using the TI46 corpora. The experimental evaluation demonstrates that the trended HMM, with either the linear or the quadratic polynomial trend function, consistently outperforms the conventional, stationary-state HMM. The evaluation also shows that models adapted by the supervised MAP procedure outperform the unadapted, speaker-independent models with as few as a single adaptation token. Further, adapting the polynomial coefficients alone is better than adapting both the polynomial coefficients and the precision matrices when fewer than four adaptation tokens are used, while the reverse holds with a greater number of adaptation tokens.
Keywords
hidden Markov models; parameter estimation; polynomials; speech recognition; Gaussian means; HMM; MAP estimates; TI46 corpora; linear polynomial trend function; maximum a posteriori approach; nonstationary-state hidden Markov model; polynomial coefficients; precision matrices; quadratic polynomial trend function; speaker adaptation; speaker-independent models; speech recognizer; state sojourn time; stationary-state HMM; time-invariant precisions; time-varying means; time-varying polynomial trend functions; trended hidden Markov model; Bayesian methods; Councils; Helium; Maximum likelihood estimation; Training data; Viterbi algorithm
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.928919
Filename
928919