• DocumentCode
    3124637
  • Title
    Pitch accent detection and prediction with DCT features and CRF model
  • Author
    Wenping Hu; Yao Qian; Frank K. Soong
  • Author_Institution
    Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2012
  • fDate
    5-8 Dec. 2012
  • Firstpage
    266
  • Lastpage
    270
  • Abstract
    Automatic detection and prediction of pitch accent, which determines whether a word contains a prominent syllable and what its pitch accent pattern is, is crucial for expressive Text-To-Speech (TTS) synthesis. Training a model to detect and predict pitch accent usually requires a large amount of training data manually annotated by phonetically trained language experts, which is both time-consuming and costly. In this paper, we propose a semi-automatic algorithm for pitch accent modeling, in which the existence of accentuation in the training data is labeled at the word level by native speakers (i.e., not phonetically trained language experts) and the type of a pitch accent is automatically detected from its vector-quantized DCT coefficient patterns. A cascaded, two-stage approach, which separates predicting pitch accent existence from determining the corresponding pitch accent type, is proposed to process any unrestricted text input with Conditional Random Field (CRF) trained models. The evaluation results show that the new approach outperforms the conventional, single-stage approach.
  • Keywords
    discrete cosine transforms; speech synthesis; CRF model; DCT features; conditional random field; pitch accent detection; pitch accent modeling; pitch accent pattern; pitch accent prediction; text-to-speech synthesis; vector quantized DCT coefficient patterns; Accuracy; Discrete cosine transforms; Hidden Markov models; Predictive models; Speech; Stress; Training; CRF; DCT; F0 contour; LBG; Pitch accent; Prosody detection and prediction; ToBI;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
  • Conference_Location
    Kowloon
  • Print_ISBN
    978-1-4673-2506-6
  • Electronic_ISBN
    978-1-4673-2505-9
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2012.6423504
  • Filename
    6423504
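
The abstract mentions typing pitch accents by their vector-quantized DCT coefficient patterns. The following is a minimal sketch of that general idea, not the authors' implementation: it takes a truncated DCT-II of an F0 contour, drops the DC term so that only contour shape remains, and assigns the result to the nearest entry of a codebook. The contours, the number of coefficients, and the two-entry codebook are all hypothetical; the keywords suggest the paper trains its codebook with LBG, whereas here the codebook is simply set by hand.

```python
import math


def dct2(contour, num_coeffs):
    """Orthonormal DCT-II of a 1-D contour, truncated to num_coeffs terms."""
    n = len(contour)
    coeffs = []
    for k in range(num_coeffs):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(contour))
        coeffs.append(scale * total)
    return coeffs


def shape_features(contour, num_coeffs=3):
    """DCT coefficients without the DC term, so the overall pitch level is ignored."""
    return dct2(contour, num_coeffs)[1:]


def nearest_codeword(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda j: dist(codebook[j]))


# Hypothetical log-F0 contours (assumed values, not data from the paper).
rising = [4.6, 4.7, 4.8, 4.9, 5.0, 5.1]
falling = list(reversed(rising))

# Hand-set codebook for illustration: entry 0 = rising shape, entry 1 = falling shape.
codebook = [shape_features(rising), shape_features(falling)]

# Quantize two unseen contours to their nearest shape pattern.
rise_index = nearest_codeword(shape_features([4.5, 4.6, 4.8, 4.9, 5.0, 5.2]), codebook)
fall_index = nearest_codeword(shape_features([5.2, 5.0, 4.9, 4.8, 4.6, 4.5]), codebook)
```

In a pipeline like the one the abstract describes, the resulting codeword index would serve as the automatically derived accent-type label, while the existence of an accent comes from the native-speaker annotation.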