Title :
Use of generation process model for synthesizing fundamental frequency contours in HMM-based speech synthesis
Author :
Hirose, Keikichi ; Hashimoto, Hiroya ; Ikeshima, J. ; Minematsu, Nobuaki
Author_Institution :
Dept. of Inf. & Commun. Eng., Univ. of Tokyo, Tokyo, Japan
Abstract :
The generation process model of fundamental frequency contours is ideal for representing global features of prosody. It is a command-response model in which the commands have clear relations to the linguistic and para-/non-linguistic information conveyed by the utterance. Therefore, by handling fundamental frequency contours in the framework of the generation process model, prosody control with increased flexibility becomes possible in speech synthesis. The model can also be used to solve problems of HMM-based speech synthesis that arise from the frame-by-frame treatment of fundamental frequencies. Two approaches are possible: one applied before the training process and one applied after the generation process. The former suppresses unnatural fundamental frequency movements in the speech used for HMM training, and the latter reshapes the fundamental frequency contours generated by HMM-based speech synthesis. A method of prosody conversion is also developed, which considers the differences in model commands between the original and target styles. The method enables flexible control of fundamental frequency contours in speech synthesis.
Keywords :
computational linguistics; hidden Markov models; interference suppression; speech synthesis; HMM-based speech synthesis; command response model; frame-by-frame treatment; fundamental frequency contour synthesis; generation process model; global feature representation; para-non linguistic information; prosody control; prosody conversion; unnatural fundamental frequency movements suppression; flexible control of prosody; fundamental frequency contour
Conference_Title :
2012 IEEE 11th International Conference on Signal Processing (ICSP)
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2196-9
DOI :
10.1109/ICoSP.2012.6491554