DocumentCode
417237
Title
Scaling of waveform segments along the time axis for concatenative speech synthesis
Author
Nishizawa, Nobuyuki ; Kawai, Hisashi
Author_Institution
ATR Spoken Language Translation Research Laboratories, Kyoto, Japan
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
Waveform scaling along the time axis is introduced as a pitch and duration conversion method for concatenative speech synthesis. This method affects F0, duration, and spectrum, but causes no degradation of naturalness when the scaling ratio is close to 1. In corpus-based concatenative speech synthesis, when there are many segment candidates with various F0 values or durations, excessive scaling may be unnecessary. Experimental results indicated that the difference in F0 and duration between the target and the selected segment became smaller. However, they also showed that the conventional cost function used in selection cannot represent the degradation of naturalness caused by spectral distortion, and that the scaling range without degradation may not be wide enough for the pitch conversion required in our synthesizer. These problems should be addressed by wider-range scaling with a new cost function that also accounts for the degradation.
Keywords
hidden Markov models; speech synthesis; waveform analysis; HMM; concatenative speech synthesis; corpus-based speech synthesis; duration conversion method; hidden Markov model; naturalness degradation cost function; noninteger-ratio sampling frequency converter; pitch conversion method; sampling frequency conversion; synthetic speech quality; text to speech synthesis; time axis; waveform scaling; waveform segment scaling; Acoustic signal processing; Cost function; Degradation; Laboratories; Natural languages; Robustness; Signal generators; Signal synthesis; Speech synthesis; Synthesizers
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326077
Filename
1326077