Title :
A superpositional model applied to F0 parameterization using DCT for text-to-speech synthesis
Author :
Stan, Adriana ; Giurgiu, Mircea
Author_Institution :
Commun. Dept., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania
Abstract :
This paper presents a superpositional model of F0 contours based on a DCT (Discrete Cosine Transform) parameterization. We examine the capacity of the DCT coefficients to capture both the fast F0 variations at the syllable level and the overall trend at the phrase level. The syllable-level coefficients are computed from the residual obtained by subtracting the estimated phrase-level contour from the original contour, on the assumption that each syllable has an additive prosodic effect on the phrase level. We also compare three decision and regression tree algorithms for clustering and predicting the DCT coefficients. Additional features are chosen with a greedy stepwise feature selection method without backtracking. The results support the proposed method: the average squared errors are low, and the synthesized speech contains few or no perceptible errors.
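The additive decomposition described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the number of retained coefficients (`n_phrase`, `n_syll`), the syllable boundaries, and the sample F0 values are all assumptions made for the example, and the abstract does not say how the phrase-level contour is actually estimated (here it is simply taken as the first few DCT coefficients of the whole phrase).

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c, n):
    """Inverse of dct() for n samples; missing high-order coefficients count as zero."""
    out = []
    for t in range(n):
        s = 0.0
        for k, ck in enumerate(c):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * ck * math.cos(math.pi * (t + 0.5) * k / n)
        out.append(s)
    return out

def decompose(f0, syllable_bounds, n_phrase=3, n_syll=4):
    """Split an F0 contour into phrase-level and syllable-level DCT coefficients.

    Assumption: the phrase trend is the low-order DCT of the whole phrase.
    """
    phrase_coefs = dct(f0)[:n_phrase]
    phrase_contour = idct(phrase_coefs, len(f0))
    # The syllable level is modelled on what remains after removing the phrase trend.
    residual = [a - b for a, b in zip(f0, phrase_contour)]
    syll_coefs = [dct(residual[s:e])[:n_syll] for s, e in syllable_bounds]
    return phrase_coefs, syll_coefs

def reconstruct(phrase_coefs, syll_coefs, syllable_bounds, n):
    """Superpose the syllable contours on the phrase contour."""
    contour = idct(phrase_coefs, n)
    for (s, e), c in zip(syllable_bounds, syll_coefs):
        for i, v in enumerate(idct(c, e - s)):
            contour[s + i] += v
    return contour

# Hypothetical F0 values in Hz, one per frame, with three 4-frame syllables.
f0 = [120.0, 126.0, 131.0, 127.0, 118.0, 122.0, 125.0, 119.0,
      110.0, 113.0, 109.0, 104.0]
bounds = [(0, 4), (4, 8), (8, 12)]

phrase_coefs, syll_coefs = decompose(f0, bounds)
rebuilt = reconstruct(phrase_coefs, syll_coefs, bounds, len(f0))
print(max(abs(a - b) for a, b in zip(f0, rebuilt)))
```

With `n_syll` equal to the syllable length, as in this toy example, the reconstruction is exact up to rounding. Keeping fewer coefficients per syllable gives a smoother approximation, which is the kind of compact representation a regression tree could then be trained to predict.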
Keywords :
decision trees; discrete cosine transforms; regression analysis; speech synthesis; DCT; decision tree; discrete cosine transform; regression tree algorithm; superpositional model; text to speech synthesis; Feature extraction; Prediction algorithms; Speech; Stress; Training; F0 modelling; pitch; prosody
Conference_Title :
Speech Technology and Human-Computer Dialogue (SpeD), 2011 6th Conference on
Conference_Location :
Brasov
Print_ISBN :
978-1-4577-0440-6
DOI :
10.1109/SPED.2011.5940734