DocumentCode :
3431306
Title :
A multi-level representation of f0 using the continuous wavelet transform and the Discrete Cosine Transform
Author :
Ribeiro, Manuel Sam ; Clark, Robert A. J.
Author_Institution :
Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4909
Lastpage :
4913
Abstract :
We propose a representation of f0 using the Continuous Wavelet Transform (CWT) and the Discrete Cosine Transform (DCT). The CWT decomposes the signal into various scales of selected frequencies, while the DCT compactly represents complex contours as a weighted sum of cosine functions. The proposed approach has the advantage of combining signal decomposition and higher-level representations, thus modeling low frequencies at higher levels and high frequencies at lower levels. Objective results indicate that this representation improves f0 prediction over traditional short-term approaches. Subjective results show improvements over the typical MSD-HMM and performance comparable to the recently proposed CWT-HMM, while using fewer parameters. These results are discussed and future lines of research are proposed.
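The abstract's description of the DCT as a compact weighted sum of cosine functions can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the DCT step (no CWT decomposition), the contour is synthetic, and the names `dct2`, `idct2`, and `max_abs_error` are hypothetical helpers introduced for illustration. It assumes an orthonormal DCT-II and keeps an arbitrary 8 coefficients.

```python
import math


def dct2(signal):
    """Orthonormal DCT-II: weights of the cosine basis functions."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(
            signal[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def idct2(coeffs, n):
    """Rebuild n samples as a weighted sum of the first len(coeffs) cosines."""
    samples = []
    for t in range(n):
        value = coeffs[0] * math.sqrt(1.0 / n)
        for k in range(1, len(coeffs)):
            value += (
                coeffs[k]
                * math.sqrt(2.0 / n)
                * math.cos(math.pi * (t + 0.5) * k / n)
            )
        samples.append(value)
    return samples


def max_abs_error(a, b):
    """Largest pointwise difference between two equal-length sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


# Synthetic, made-up contour (an assumption, not data from the paper):
# 50 frames of a smooth log-f0-like curve.
contour = [
    5.0 + 0.2 * math.sin(2 * math.pi * t / 50) + 0.05 * t / 50 for t in range(50)
]

coeffs = dct2(contour)
full = idct2(coeffs, len(contour))          # uses all 50 coefficients
compact = idct2(coeffs[:8], len(contour))   # uses only the first 8

print("max error, 50 coefficients:", max_abs_error(contour, full))
print("max error,  8 coefficients:", max_abs_error(contour, compact))
```

Keeping only the leading coefficients retains the slow-moving shape of the contour and discards fine detail, which is the sense in which a few DCT weights can stand in for a longer sequence of frame-level values.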
Keywords :
discrete cosine transforms; speech synthesis; CWT; DCT; continuous wavelet transform; cosine functions; discrete cosine transform; higher level representations; multilevel representation; selected frequencies; signal decomposition; statistical parametric speech synthesis techniques; Continuous wavelet transforms; Discrete cosine transforms; Hidden Markov models; Speech; Speech synthesis; HMM-based synthesis; continuous wavelet transform; discrete cosine transform; f0 modeling; prosody;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178904
Filename :
7178904