Title :
Development of a text-to-speech system for Japanese based on waveform splicing
Author :
Kawa, H. ; Higuchi, Norio ; Simizu, Tohru ; Yamamoto, Seiichi
Author_Institution :
KDD Kamifukuoka R&D Labs., Saitama, Japan
Abstract :
A text-to-speech system for Japanese was developed based on waveform splicing. A stored unit is a sequence of phonemes segmented at vowel-consonant boundaries. Four phoneme groups are distinguished for the preceding phonemic environment and eight for the succeeding one. An inventory of waveform segments covering 1020 frequently used units was constructed based on a statistical analysis of a text database of 20 million phonemes. Each stored unit has, on average, 2.5 waveform segments that differ in fundamental frequency (F0) and phoneme duration. The F0 and phoneme duration are modified by a pitch-synchronous overlap-add (PSOLA) method. A time window with a flat portion at its center (Tukey window) was adopted in place of an ordinary Hanning window. A preference test indicated that the Tukey window gives better quality when the F0 is lowered. The articulation score in an intelligibility test was 89.2%.
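The difference between the two windows named in the abstract can be illustrated with a minimal sketch. It uses the standard tapered-cosine definition of the Tukey window; the function names and the taper fraction `alpha` are assumptions made for illustration, since the abstract does not state the window parameters used in the system.

```python
import math

def tukey_window(n, alpha=0.5):
    """Tapered-cosine (Tukey) window of length n.

    alpha is the fraction of the window covered by the two cosine tapers
    (an assumed parameter, not a value from the abstract):
    alpha = 0 gives a rectangular window, alpha = 1 gives a Hanning window.
    """
    if n < 1:
        return []
    if n == 1:
        return [1.0]
    if alpha <= 0:
        return [1.0] * n
    alpha = min(alpha, 1.0)
    edge = alpha * (n - 1) / 2.0  # length of each taper, in samples
    window = []
    for i in range(n):
        if i < edge:
            # rising cosine taper
            window.append(0.5 * (1 - math.cos(math.pi * i / edge)))
        elif i > (n - 1) - edge:
            # falling cosine taper (mirror image of the rising one)
            window.append(0.5 * (1 - math.cos(math.pi * (n - 1 - i) / edge)))
        else:
            # flat portion at the center
            window.append(1.0)
    return window

def hanning_window(n):
    """Hanning window, obtained as the alpha = 1 case of the Tukey window."""
    return tukey_window(n, alpha=1.0)

if __name__ == "__main__":
    print([round(v, 3) for v in tukey_window(9, 0.5)])
    print([round(v, 3) for v in hanning_window(9)])
```

With `alpha` below 1, the central samples of each pitch-synchronous segment pass through unattenuated, whereas the Hanning window reaches full gain at a single point only.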
Keywords :
natural languages; speech intelligibility; speech synthesis; statistical analysis; Hanning window; Japanese; Tukey window; acoustic processor; articulation score; fundamental frequency; intelligibility test; phoneme duration; phonemes; pitch synchronous overlap add method; preference test; stored unit; text database; text-to-speech system; time window; vowel-consonant boundaries; waveform segments; waveform splicing; Acoustic waves; Frequency; Laboratories; Microcomputers; Research and development; Speech analysis; Speech processing; Speech synthesis; Splicing; Testing
Conference_Title :
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location :
Adelaide, SA
Print_ISBN :
0-7803-1775-0
DOI :
10.1109/ICASSP.1994.389230