• DocumentCode
    2016586
  • Title
    Automatic prosody prediction and detection with Conditional Random Field (CRF) models
  • Author
    Qian, Yao ; Wu, Zhizheng ; Ma, Xuezhe ; Soong, Frank
  • Author_Institution
    Microsoft Res. Asia, Beijing, China
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 3 2010
  • Firstpage
    135
  • Lastpage
    138
  • Abstract
    While current TTS systems can deliver quite acceptable segmental quality of synthesized speech for voice user interface applications, their prosody is still perceived by users as “robotic” or not expressive. In this paper, we investigate how to improve TTS prosody prediction and detection. A Conditional Random Field (CRF), a discriminative probabilistic model for labeling sequential data, is adopted. Rich syntactic and acoustic contextual features are used in building the CRF models. Experiments performed on the Boston University Radio Speech Corpus show that CRF models trained on our proposed rich contextual features can improve the accuracy of prosody prediction and detection in both speaker-dependent and speaker-independent cases. The performance is either comparable to or better than the best reported results.
  • Keywords
    acoustics; computational linguistics; natural language interfaces; natural language processing; probabilistic logic; speaker recognition; speech processing; speech synthesis; Boston university radio speech corpus; TTS prosody prediction; automatic prosody prediction; conditional random field; discriminative probabilistic model; rich contextual feature; speech synthesis; text to speech systems; voice user interface; Acoustics; Feature extraction; Hidden Markov models; Silicon; Speech; Syntactics; Training; CRF; Prosody; Prosody event detection; Prosody label prediction; acoustic; syntactic;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2010.5684835
  • Filename
    5684835