DocumentCode
3124637
Title
Pitch accent detection and prediction with DCT features and CRF model
Author
Wenping Hu; Yao Qian; Frank K. Soong
Author_Institution
Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2012
fDate
5-8 Dec. 2012
Firstpage
266
Lastpage
270
Abstract
Automatic detection and prediction of pitch accent, which determines whether a word contains a prominent syllable and what its pitch accent pattern is, is crucial for expressive Text-To-Speech (TTS) synthesis. Training a model to detect and predict pitch accent usually requires a large amount of training data manually annotated by phonetically trained language experts, which is both time-consuming and costly. In this paper, we propose a semi-automatic algorithm for pitch accent modeling, in which the existence of accentuation in the training data is labeled at the word level by native speakers (i.e., not phonetically trained language experts) and the type of each pitch accent is automatically detected from its vector-quantized DCT coefficient patterns. A cascaded, two-stage approach, which separates predicting pitch accent existence from determining the corresponding pitch accent type, is proposed to process unrestricted text input with Conditional Random Field (CRF) trained models. The evaluation results show that the new approach outperforms the conventional single-stage approach.
Keywords
discrete cosine transforms; speech synthesis; CRF model; DCT features; conditional random field; pitch accent detection; pitch accent modeling; pitch accent pattern; pitch accent prediction; text-to-speech synthesis; vector quantized DCT coefficient patterns; Accuracy; Discrete cosine transforms; Hidden Markov models; Predictive models; Speech; Stress; Training; CRF; DCT; F0 contour; LBG; Pitch accent; Prosody detection and prediction; ToBI
fLanguage
English
Publisher
IEEE
Conference_Titel
2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Conference_Location
Kowloon
Print_ISBN
978-1-4673-2506-6
Electronic_ISBN
978-1-4673-2505-9
Type
conf
DOI
10.1109/ISCSLP.2012.6423504
Filename
6423504