• DocumentCode
    1316401
  • Title

    Product of Experts for Statistical Parametric Speech Synthesis

  • Author

    Zen, Heiga ; Gales, Mark J F ; Nankaku, Yoshihiko ; Tokuda, Keiichi

  • Author_Institution
    Nagoya Inst. of Technol., Nagoya, Japan
  • Volume
    20
  • Issue
    3
  • fYear
    2012
  • fDate
    3/1/2012 12:00:00 AM
  • Firstpage
    794
  • Lastpage
    805
  • Abstract
    Multiple acoustic models are often combined in statistical parametric speech synthesis. Both linear and non-linear functions of an observation sequence are used as features to be modeled. This paper shows that this combination of multiple acoustic models can be expressed as a product of experts (PoE); the likelihoods from the models are scaled, multiplied together, and then normalized. Normally these models are individually trained and only combined at the synthesis stage. This paper discusses a more consistent PoE framework where the models are jointly trained. A training algorithm for PoEs based on linear feature functions and Gaussian experts is derived by generalizing the training algorithm for trajectory HMMs. However, for non-linear feature functions or non-Gaussian experts, such a derivation is not possible, so a scheme based on contrastive divergence learning is described. Experimental results show that the PoE framework provides both a mathematically elegant way to train multiple acoustic models jointly and significant improvements in the quality of the synthesized speech.
  • Keywords
    acoustic signal processing; hidden Markov models; speech synthesis; PoE framework; acoustic model; contrastive divergence learning; nonGaussian expert; nonlinear feature function; product of experts; statistical parametric speech synthesis; training algorithm; trajectory HMM; Acoustics; Adaptation models; Hidden Markov models; Speech; Speech synthesis; Trajectory; Product of experts (PoE); statistical parametric speech synthesis; trajectory hidden Markov model (HMM);
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2011.2165280
  • Filename
    6012516
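
The abstract describes a product of experts as likelihoods that are "scaled, multiplied together, and then normalized". The sketch below illustrates only that combination rule for the simplest case: scalar Gaussian experts on a single observation. It is not the paper's training algorithm and involves no HMMs or feature functions. The function names, the grid bounds, and the example values are all assumptions made for illustration.

```python
import math


def product_of_gaussian_experts(means, variances, weights=None):
    """Closed-form product of 1-D Gaussian experts N(x; m_i, v_i)^a_i.

    After normalization the product is again Gaussian, with
    precision = sum(a_i / v_i) and mean = sum(a_i * m_i / v_i) / precision.
    Returns (mean, variance).
    """
    if weights is None:
        weights = [1.0] * len(means)  # assumption: unscaled experts by default
    precision = sum(a / v for a, v in zip(weights, variances))
    mean = sum(a * m / v for a, m, v in zip(weights, means, variances)) / precision
    return mean, 1.0 / precision


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def numeric_product_mean(means, variances, weights, lo=-20.0, hi=20.0, steps=40001):
    """Brute-force check: scale, multiply, and normalize on a grid."""
    dx = (hi - lo) / (steps - 1)
    total = 0.0
    first_moment = 0.0
    for k in range(steps):
        x = lo + k * dx
        p = 1.0
        for m, v, a in zip(means, variances, weights):
            p *= gaussian_pdf(x, m, v) ** a  # scale each expert, then multiply
        total += p
        first_moment += x * p
    return first_moment / total  # dividing by the total is the normalization


if __name__ == "__main__":
    mean, variance = product_of_gaussian_experts([0.0, 2.0], [1.0, 1.0])
    print(mean, variance)
```

With two unit-variance experts centered at 0 and 2, the combined density is centered midway between them and is narrower than either expert, which is the sharpening behavior that distinguishes a product of experts from a mixture.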