DocumentCode
1064959
Title
Speaker adaptation using combined transformation and Bayesian methods
Author
Digalakis, Vassilios V. ; Neumeyer, Leonardo G.
Author_Institution
Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
Volume
4
Issue
4
fYear
1996
fDate
7/1/1996 12:00:00 AM
Firstpage
294
Lastpage
300
Abstract
Adapting the parameters of a statistical speaker-independent continuous-speech recognizer to the speaker and the channel can significantly improve the recognition performance and robustness of the system. In continuous mixture-density hidden Markov models, the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we have recently proposed a constrained estimation technique for Gaussian mixture densities. To improve the behavior of our adaptation scheme for large amounts of adaptation data, we combine it here with Bayesian techniques. We evaluate our algorithms on the large-vocabulary Wall Street Journal corpus for nonnative speakers of American English. The recognition error rate is approximately halved with only a small amount of adaptation data, and it approaches the speaker-independent accuracy achieved for native speakers.
Keywords
Bayes methods; Gaussian processes; adaptive estimation; hidden Markov models; maximum likelihood estimation; speech recognition; American English; Bayesian method; Gaussian mixture densities; algorithms; constrained estimation technique; continuous mixture-density hidden Markov models; large-vocabulary Wall Street Journal corpus; native speakers; nonnative speakers; recognition error rate; recognition performance; robust maximum-likelihood estimates; robustness; speaker adaptation; statistical speaker independent continuous-speech recognizer; transformation method; Bayesian methods; Degradation; Error analysis; Hidden Markov models; Maximum likelihood estimation; Natural languages; Robustness; Speech recognition; Testing; Training data;
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.506933
Filename
506933