DocumentCode :
1687234
Title :
Training a supra-segmental parametric F0 model without interpolating F0
Author :
Latorre, Javier ; Gales, Mark J.F. ; Knill, Kate ; Akamine, Masami
Author_Institution :
Cambridge Research Laboratory, Toshiba Research Europe Ltd., Cambridge, UK
fYear :
2013
Firstpage :
6880
Lastpage :
6884
Abstract :
Combining multiple intonation models at different linguistic levels is an effective way to improve the naturalness of the predicted F0. In many of these approaches, the intonation models for suprasegmental levels are based on a parametrization of the log-F0 contours over the units of that level. However, many of these parametrizations are not stable when applied to discontinuous signals, so the F0 signal has to be interpolated. The interpolated values distort the coefficients, which degrades the quality of the model. This paper proposes two methods that eliminate the need for such interpolation, one based on regularization and the other on factor analysis. Subjective evaluations show that, for a discrete cosine transform (DCT) syllable-level model, both approaches yield a significant improvement over a baseline using interpolated F0. The approach based on regularization yields the best results.
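The abstract does not spell out either method. As a rough illustration of the regularization idea only, the sketch below fits syllable-level DCT coefficients to the voiced frames of a log-F0 contour with a ridge penalty, so that no interpolated values enter the fit. The function names (`dct_basis`, `fit_dct_voiced`), the ridge penalty itself, the default `lam`, and the choice to leave the mean term unpenalized are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def dct_basis(n_frames, n_coefs):
    """DCT-II cosine basis sampled at frame centres; shape (n_frames, n_coefs)."""
    t = (np.arange(n_frames) + 0.5) / n_frames
    k = np.arange(n_coefs)
    return np.cos(np.pi * np.outer(t, k))

def fit_dct_voiced(log_f0, voiced, n_coefs=5, lam=0.1):
    """Ridge-regularized least-squares DCT fit using voiced frames only.

    log_f0 : 1-D array of log-F0 values (entries at unvoiced frames are ignored)
    voiced : boolean mask, True where F0 is defined
    lam    : ridge weight (hypothetical default, chosen for illustration)
    """
    basis = dct_basis(len(log_f0), n_coefs)
    b_v = basis[voiced]          # keep only rows for voiced frames
    y_v = log_f0[voiced]
    # Assumption: penalize every coefficient except the mean term (k = 0),
    # so the penalty stabilizes the shape without biasing the pitch level.
    penalty = lam * np.eye(n_coefs)
    penalty[0, 0] = 0.0
    return np.linalg.solve(b_v.T @ b_v + penalty, b_v.T @ y_v)
```

The penalty keeps the normal equations solvable even when a syllable has fewer voiced frames than coefficients, which is the situation where an unregularized fit to a discontinuous contour becomes unstable.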
Keywords :
discrete cosine transforms; interpolation; speech processing; discrete-cosine-transform syllable-level model; intonation models; supra-segmental parametric F0 model; Analytical models; Computational modeling; Hidden Markov models; Mathematical model; Speech; F0 interpolation; factor analysis; intonation; regularization; speech synthesis
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638995
Filename :
6638995