DocumentCode :
394335
Title :
Segment selection considering local degradation of naturalness in concatenative speech synthesis
Author :
Toda, Tomoki ; Kawai, Hisashi ; Tsuzaki, Masanori ; Shikano, Kiyohiro
Author_Institution :
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
Volume :
1
fYear :
2003
fDate :
6-10 April 2003
Abstract :
In this paper, we investigate the effect of using a novel cost, the RMS (root mean square) cost, in segment selection for concatenative text-to-speech synthesis. The RMS cost is affected not only by the total degradation of naturalness but also by the local degradation of naturalness. Experiments comparing this approach with segment selection based on a conventional average cost show that: (1) segment selection based on the RMS cost performs a larger number of concatenations that cause slight local degradation in order to avoid concatenations that cause greater local degradation; and (2) the effect of the RMS cost has little dependence on the size of the corpus. Moreover, we clarify that the naturalness of synthetic speech can be slightly improved by utilizing the RMS cost.
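The contrast between the two costs can be illustrated with a minimal sketch. It assumes each candidate segment sequence is summarized by a list of non-negative local cost values; the numbers and function names below are hypothetical and are not taken from the paper, whose actual local cost definition and weighting are not given in this record.

```python
import math


def average_cost(local_costs):
    """Conventional average of the local costs (reflects total degradation only)."""
    return sum(local_costs) / len(local_costs)


def rms_cost(local_costs):
    """Root mean square of the local costs (penalizes large local values more)."""
    return math.sqrt(sum(c * c for c in local_costs) / len(local_costs))


# Hypothetical candidates with the same total local cost.
spread = [0.5, 0.5, 0.5, 0.5]  # several slight local degradations
peaked = [0.1, 0.1, 0.1, 1.7]  # one severe local degradation

for name, costs in (("spread", spread), ("peaked", peaked)):
    print(name, round(average_cost(costs), 3), round(rms_cost(costs), 3))
```

Under these assumed values the average cost cannot distinguish the two candidates, while the RMS cost is lower for the one whose degradation is spread out, which matches finding (1) in the abstract.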
Keywords :
speech processing; speech synthesis; RMS cost; concatenative speech synthesis; local naturalness degradation; root mean square cost; segment selection; text-to-speech synthesis; Cities and towns; Cost function; Degradation; Information science; Laboratories; Natural languages; Root mean square; Speech synthesis; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1198876
Filename :
1198876