• DocumentCode
    2065647
  • Title

    Multi-Layer F0 Modeling for HMM-Based Speech Synthesis

  • Author

    Wang, Cheng-Cheng ; Ling, Zhen-Hua ; Zhang, Bu-Fan ; Dai, Li-Rong

  • Author_Institution
    iFlytek Speech Lab., Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2008
  • fDate
    16-19 Dec. 2008
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper proposes a two-layer fundamental frequency (F0) modeling method for HMM-based parametric speech synthesis. In the conventional HMM-based speech synthesis system, F0 models are trained for each context-dependent phoneme. Considering the super-segmental characteristics of F0 features, an explicit syllable-layer F0 model is introduced in this paper. At the synthesis stage, the F0 contour is generated by maximizing the combined likelihood functions of the phone-layer and syllable-layer F0 models. The objective and subjective evaluation results in our experiments show that the proposed multi-layer F0 modeling method can improve the performance of F0 prediction for emotional speech synthesis.
  • Keywords
    hidden Markov models; maximum likelihood estimation; speech synthesis; HMM-based speech synthesis; maximum combined likelihood functions; multi-layer modeling; two-layer fundamental frequency modeling method; Context modeling; Frequency synthesizers; Hidden Markov models; Predictive models; Probability distribution; Spatial databases; Speech analysis; Speech recognition; Speech synthesis; Stress;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on
  • Conference_Location
    Kunming
  • Print_ISBN
    978-1-4244-2942-4
  • Electronic_ISBN
    978-1-4244-2943-1
  • Type

    conf

  • DOI
    10.1109/CHINSL.2008.ECP.44
  • Filename
    4730298