• DocumentCode
    417235
  • Title

    Modeling pronunciation variation for spontaneous speech synthesis

  • Author

    Werner, Steffen ; Wolff, Matthias ; Eichner, Matthias ; Hoffmann, R.

  • Author_Institution
    Lab. of Acoust. & Speech Commun., Dresden Univ. of Technol., Germany
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Integrating pronunciation modeling into speech synthesis makes synthetic speech sound more natural and colloquial. Pronunciation variation, as one observable effect in spontaneous speech, is a step towards spontaneous speech synthesis. In previous work (see Proc. ICASSP, vol.1, p.417-20, Orlando, FL, USA, 2002 and Proc. ICASSP, Hong Kong, PR China, Apr. 2003) we introduced different duration control methods for speech synthesis. These methods are based on the observation that words that are very likely to occur in a given context are pronounced faster and less accurately than improbable ones (see D. Jurafsky et al., Proc. ICASSP, vol.2, p.801-4, Salt Lake City, USA, 2001). We therefore use the probability of a word in its context either to control the local speaking rate directly or to select appropriate pronunciation variants that change the local speaking rate. Extending these methods with a pronunciation sequence model, we incorporate knowledge about how well two subsequent variants fit together. With the proposed algorithm we could further improve the natural and colloquial listening impression.
  • Keywords
    probability; speech synthesis; stochastic processes; colloquial synthetic speech; duration control speech synthesis; local speaking rate; multiple duration control methods; natural synthetic speech; pronunciation lexicon; pronunciation sequence model; pronunciation variation; spoken word probability; spontaneous speech synthesis; stochastic Markov graph; variant sequence model; word boundary effects; Acoustics; Communication system control; Laboratories; Lattices; Natural languages; Oral communication; Probability; Speech synthesis; Stochastic processes; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326075
  • Filename
    1326075