DocumentCode
3203977
Title
A Chinese text to speech system based on TD-PSOLA
Author
Zhu, Yunbo ; Zhao, Li ; Xu, Yunbiao ; Niimi, Yasuhisa
Author_Institution
Dept. of Radio Engineering, Southeast Univ., Nanjing, China
Volume
1
fYear
2002
fDate
28-31 Oct. 2002
Firstpage
204
Abstract
This paper presents the implementation of a Chinese text-to-speech (TTS) system based on the Time Domain Pitch-Synchronous OverLap-Add (TD-PSOLA) approach. To obtain natural synthesized speech, it is necessary to precisely extract pitch-marks for each monosyllabic speech unit, to predict the length of the syllables in the sentence to be synthesized, and to generate F0-contours for their final portion. The paper concentrates on the last two issues: it proposes a scheme for predicting syllable duration, which achieves a relative length error of about 18%, and for generating the F0-contour. To synthesize a given tonal syllable with a desired duration, a new pattern-scaling algorithm is proposed. A preliminary listening test showed that the intelligibility and naturalness of the synthetic speech were good.
Keywords
feature extraction; prediction theory; speech intelligibility; speech synthesis; time-domain synthesis; Chinese text to speech system; F0-contour generation; TD-PSOLA; TTS system; Time Domain Pitch-Synchronous OverLap-Add approach; monosyllabic speech unit; natural synthesized speech; pattern-scaling algorithm; pitch-mark extraction; speech intelligibility; syllable duration; syllable length prediction; synthetic speech naturalness; Auditory system; Computer errors; Costs; Information science; Natural languages; Signal synthesis; Speech analysis; Speech synthesis; System performance; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN
0-7803-7490-8
Type
conf
DOI
10.1109/TENCON.2002.1181250
Filename
1181250