Title :
Tonal context labeling using quantized F0 symbols for improving tone correctness in average-voice-based speech synthesis
Author :
Chunwijitra, Vataya ; Nose, Takashi ; Kobayashi, Takao
Author_Institution :
Interdiscipl. Grad. Sch. of Sci. & Eng., Tokyo Inst. of Technol., Yokohama, Japan
Abstract :
This paper proposes a technique for improving tone correctness in Thai speech synthesis based on an average voice model trained on a nonprofessional speech corpus. The proposed technique uses quantized F0 symbols as the tonal context in order to obtain an appropriate F0 model. With this technique, the prosodic context can be extracted directly from real speech, which prevents the inconsistency between the speech data and the F0 labels generated from transcription that affects the naturalness and tone correctness of synthetic speech. We examine two types of tonal context labeling using the quantized F0 symbols, based on phone and sub-phone boundaries. Experimental results of both objective and subjective tests show that the proposed technique can improve not only the naturalness but also the tone correctness of synthetic speech when only a small amount of speech data from nonprofessional target speakers is used.
Keywords :
hidden Markov models; speech synthesis; average-voice-based speech synthesis; nonprofessional target speakers; quantized F0 symbol; tonal context labeling; voice model; Adaptation models; Context; Context modeling; Hidden Markov models; Labeling; Speech; Speech synthesis; F0 modeling; F0 quantization; HMM-based speech synthesis; average voice model
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947406