DocumentCode
2022600
Title
An all-phoneme ergodic HMM for unsupervised speaker adaptation
Author
Miyazawa, Yasunaga
Author_Institution
ATR Interpreting Telephony Res. Lab., Soraku-gun, Kyoto, Japan
Volume
2
fYear
1993
fDate
27-30 April 1993
Firstpage
574
Abstract
The author proposes an all-phoneme ergodic HMM (hidden Markov model) that incorporates stochastic language constraints in unsupervised speaker adaptation. The proposed model consists of all-phoneme HMMs and interphoneme probabilities. It can be regarded as a rather large single ergodic HMM containing hidden states of all phonemes as well as intraphoneme and interphoneme transition probabilities. Since this model is a model of arbitrarily spoken words, the standard Baum-Welch reestimation algorithm can be used to train the whole ergodic model. In the experiments, only mean vectors of the state output probability densities are reestimated, and a vector field smoothing algorithm is used to enhance the statistical reliability. The proposed method was tested on phoneme and phrase recognition experiments with male reference and input speakers. A better performance than with the speaker-independent case was attained by using adaptation data shorter than three minutes.
Keywords
adaptive systems; constraint handling; hidden Markov models; reliability; speech recognition; stochastic systems; unsupervised learning; Baum-Welch reestimation algorithm; all-phoneme ergodic HMM; hidden Markov model; interphoneme probabilities; performance; phoneme recognition; phrase recognition; state output probability densities; statistical reliability; stochastic language constraints; unsupervised speaker adaptation; vector field smoothing algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location
Minneapolis, MN, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.1993.319372
Filename
319372