• DocumentCode
    417229
  • Title

    Probability based prosody model for unit selection

  • Author

    Ma, Xijun ; Zhang, Wei ; Zhu, Weibin ; Shi, Qin ; Jin, Ling

  • Author_Institution
    IBM China Res. Lab, China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Most modern text-to-speech (TTS) systems are based on unit selection. In such systems, the predicted prosody values for each synthesis unit, such as pitch, duration, and energy, are important factors in guiding unit selection. We present a probability-based prosody model in which the distribution of prosody values in a given context-equivalent cluster is described by a Gaussian mixture model (GMM), and the distance between a candidate unit and the context-equivalent cluster is defined by the GMM probability output. A novel framework for unit-selection TTS systems is derived from the model, and a series of experiments is conducted on the framework.
  • Keywords
    Gaussian processes; speech synthesis; statistical distributions; GMM probability output; Gaussian mixture model; TTS systems; context equivalent cluster; probability distribution; prosody model; text-to-speech systems; unit selection style; Context modeling; Fuzzy systems; Predictive models; Probability distribution; Speech synthesis; Statistics; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326069
  • Filename
    1326069