DocumentCode :
2801358
Title :
Statistical parametric speech synthesis based on product of experts
Author :
Zen, Heiga ; Gales, Mark J F ; Nankaku, Yoshihiko ; Tokuda, Keiichi
Author_Institution :
Cambridge Res. Lab., Toshiba Res. Eur. Ltd., Cambridge, UK
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4242
Lastpage :
4245
Abstract :
Multiple-level acoustic models (AMs) are often combined in statistical parametric speech synthesis. Both linear and non-linear functions of the observation sequence are used as features in these AMs. This combination of multiple-level AMs can be expressed as a product of experts (PoE); the likelihoods from the AMs are scaled, multiplied together, and then normalized. Currently these multiple-level AMs are individually trained and only combined at the synthesis stage. This paper discusses a more consistent PoE framework where the AMs are jointly trained. A generalization of trajectory HMM training can be used for multiple-level Gaussian AMs based on linear functions. However, for the non-linear case this is not possible, so a scheme based on contrastive divergence learning is described. Experimental results show that the proposed technique provides both a mathematically elegant way to train multiple-level AMs and statistically significant improvements in the quality of synthesized speech.
Keywords :
acoustic signal processing; hidden Markov models; learning (artificial intelligence); nonlinear functions; speech synthesis; statistical analysis; Gaussian AM; PoE; contrastive divergence learning; linear function; multiple level acoustic models; nonlinear function; product of experts; statistical parametric speech synthesis; trajectory HMM; Computer science; Data mining; Degradation; Europe; Hidden Markov models; Laboratories; Robustness; Speech synthesis; Training data; Vocoders; Statistical parametric speech synthesis; product of experts; trajectory HMM;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495691
Filename :
5495691