Title :
Coherent modification of pitch and energy for expressive prosody implantation
Author :
Sorin, Alexander ; Shechtman, Slava ; Pollet, Vincent
Author_Institution :
Speech Technol., IBM Res. - Haifa, Haifa, Israel
Abstract :
In expressive TTS and voice transformation systems, implantation of expressive prosody derived from external out-of-domain sources often leads to extreme pitch modification that compromises the naturalness of the synthesized speech. In this work, we investigate and prove the hypothesis that the naturalness loss is partly attributable to a violation of a fundamental relationship between the instantaneous pitch frequency and the instantaneous energy of a speech signal. We propose an enhancement for pitch modification in which the instantaneous energy is modified coherently with the pitch frequency, and we demonstrate the potential of this method in a subjective listening evaluation. The proposed approach is complementary to spectrum shape transformation methods and can be combined with them to achieve the maximal possible quality of pitch modification.
Keywords :
speech synthesis; voice equipment; energy coherent modification; expressive TTS; expressive prosody implantation; naturalness loss; pitch coherent modification; voice transformation systems; Estimation; Harmonic analysis; Hidden Markov models; Noise; Prototypes; Shape; Speech; energy modification; energy modulation; pitch modification; prosody modification
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178905