  • DocumentCode
    417237
  • Title
    Scaling of waveform segments along the time axis for concatenative speech synthesis
  • Author
    Nishizawa, Nobuyuki; Kawai, Hisashi
  • Author_Institution
    ATR Spoken Language Translation Res. Labs., Kyoto, Japan
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Waveform scaling along the time axis is introduced as a pitch and duration conversion method for concatenative speech synthesis. This method affects F0, duration and spectrum, although naturalness is not degraded when the scaling ratio is close to 1. In corpus-based concatenative speech synthesis, when there are many segment candidates with various F0 values or durations, excessive scaling may be unnecessary. Experimental results indicated that the difference in F0 and duration between the target and a selected segment became smaller. However, they also showed that the conventional cost function used in selection cannot represent the degradation of naturalness caused by spectral distortion, and that the scaling range without degradation may not be enough for the pitch conversion required in our synthesizer. These problems should be addressed by wider-range scaling with a new cost function that also accounts for the degradation.
  • Keywords
    hidden Markov models; speech synthesis; waveform analysis; HMM; concatenative speech synthesis; corpus-based speech synthesis; duration conversion method; hidden Markov model; naturalness degradation cost function; noninteger-ratio sampling frequency converter; pitch conversion method; sampling frequency conversion; synthetic speech quality; text to speech synthesis; time axis; waveform scaling; waveform segment scaling; Acoustic signal processing; Cost function; Degradation; Laboratories; Natural languages; Robustness; Signal generators; Signal synthesis; Speech synthesis; Synthesizers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326077
  • Filename
    1326077
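
The abstract's point that time-axis scaling changes F0 and duration together can be illustrated with a minimal sketch. It assumes plain linear-interpolation resampling, which is not necessarily the noninteger-ratio sampling frequency converter used in the paper. The names `scale_time_axis` and `estimate_f0`, the 16 kHz rate, the 200 Hz test tone, and the ratio 1.05 are all illustrative assumptions. When the result is played back at the original sampling rate, a scaling ratio r multiplies duration by r and divides F0 by r.

```python
import math

def scale_time_axis(samples, ratio):
    """Stretch (ratio > 1) or compress (ratio < 1) a waveform along the time axis.

    Hypothetical helper based on linear interpolation. Played back at the
    original sampling rate, duration is multiplied by `ratio` and F0 is
    divided by `ratio`.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, round(len(samples) * ratio))
    out = []
    for i in range(n_out):
        pos = i / ratio                  # fractional position in the input
        left = min(int(pos), last)
        right = min(left + 1, last)      # clamp at the final sample
        frac = pos - left
        out.append((1.0 - frac) * samples[left] + frac * samples[right])
    return out

def estimate_f0(samples, rate):
    """Rough F0 estimate (Hz) from the number of upward zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * rate / len(samples)

RATE = 16000  # assumed sampling rate in Hz
# One second of a 200 Hz sine tone as a stand-in for a voiced segment.
tone = [math.sin(2 * math.pi * 200 * n / RATE) for n in range(RATE)]
scaled = scale_time_axis(tone, 1.05)

print(len(scaled) / RATE)         # duration in seconds after scaling
print(estimate_f0(tone, RATE))    # approximate F0 before scaling
print(estimate_f0(scaled, RATE))  # approximate F0 after scaling
```

Because a single ratio drives both quantities, F0 and duration cannot be adjusted independently with this operation alone, and the spectral envelope shifts along with F0, which is consistent with the abstract's remark that scaling affects the spectrum as well.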