Title :
An improved method for predicting fundamental frequency contour in Mandarin text-to-speech system with a small corpus
Author :
Wang, Liang ; Zhu, Jie ; LV, Yao
Author_Institution :
Dept. of Electron. Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
This paper proposes a method for predicting the fundamental frequency contour in a Mandarin text-to-speech system built on a small corpus. First, to avoid large modifications to the speech clips, two corpora are established: a tonal syllable corpus and a high-frequency word corpus. Two rules are then applied to predict the pitch contour of the speech. The first modifies the traditional Fujisaki model to fit the small corpus. The second simulates pitch jitter with a model based on a Gaussian mixture model (GMM). Using the fundamental frequency contour predicted by the modified Fujisaki model and the jitter model, the pitch of the speech clips is adjusted with the PSOLA algorithm, which improves the prosody of the synthesized speech and makes it sound more natural. Experiments demonstrate that the method is effective for a Mandarin text-to-speech system based on a small corpus.
Keywords :
speech synthesis; GMM; Mandarin text-to-speech system; PSOLA algorithm; fundamental frequency contour prediction; high-frequency word corpus; modified Fujisaki model; pitch contour; pitch jitter; speech clips; tonal syllable corpus; Fujisaki model; Text-to-speech; fundamental frequency contour; jitter; Mandarin; small corpus
Conference_Title :
TENCON 2010 - 2010 IEEE Region 10 Conference
Conference_Location :
Fukuoka
Print_ISBN :
978-1-4244-6889-8
DOI :
10.1109/TENCON.2010.5686602