Title :
A simple and effective pitch re-estimation method for rich prosody and speaking styles in HMM-based speech synthesis
Author :
Cheng-Yuan Lin ; Chien-Hung Huang ; Chih-Chung Kuo
Author_Institution :
ITRI, Hsinchu, Taiwan
Abstract :
This paper proposes a novel controllable pitch re-estimation method that can produce better pitch contours or provide diverse speaking styles for text-to-speech (TTS) systems. The method is composed of a pitch re-estimation model and a set of control parameters. The pitch re-estimation model is employed to reduce the over-smoothing effects that are usually introduced by TTS training. The control parameters are designed to generate not only rich intonation but also distinct speaking styles, e.g. a foreign accent or an excited tone. To verify the feasibility of the proposed method, we conducted experiments with both objective measures and subjective tests. Although the re-estimated pitch yields only slightly lower prediction error in the objective measures, it produces clearly better intonation in the listening test. Moreover, expressive speech can be generated successfully under the framework of controllable pitch re-estimation.
Keywords :
hidden Markov models; speech synthesis; HMM-based speech synthesis; TTS systems; TTS training; controllable pitch reestimation; effective pitch reestimation method; listening test; pitch reestimation model; prediction error; rich prosody styles; speaking styles; text-to-speech systems; Estimation; Hidden Markov models; Robots; Shape; Speech; Speech synthesis; Training; Pitch re-estimation; expressive speech; prosody control; text-to-speech
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
DOI :
10.1109/ISCSLP.2012.6423473