DocumentCode :
730770
Title :
HMM-based emphatic speech synthesis for corrective feedback in computer-aided pronunciation training
Author :
Yishuang Ning ; Zhiyong Wu ; Jia Jia ; Fanbo Meng ; Helen Meng ; Lianhong Cai
Author_Institution :
Shenzhen Key Lab. of Inf. Sci. & Technol., Tsinghua Univ., Shenzhen, China
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4934
Lastpage :
4938
Abstract :
This paper investigates incorporating hidden Markov model (HMM) based emphatic speech synthesis for audio exaggeration into an audio-visual speech synthesis framework for corrective feedback in computer-aided pronunciation training (CAPT). To improve the voice quality of the synthetic emphatic speech, this paper proposes a new method for HMM training. In this method, the contextual questions used for decision tree building are extended with emphasis-related information. HMMs are then trained using a small-scale emphatic corpus together with a large-scale neutral corpus. The emphatic corpus is used to ensure the quality of the emphatic speech segments, whereas the neutral corpus is used to further improve the quality of both the non-emphatic and the emphatic speech segments. Finally, emphatic speech synthesis is achieved by extending Flite+hts_engine. Experimental results show that the method can synthesize high-quality emphatic speech and make the feedback more discriminatively perceptible.
Keywords :
audio-visual systems; decision trees; feedback; hidden Markov models; speech synthesis; CAPT; Flite+hts_engine; HMM-based emphatic speech synthesis; audio exaggeration; audio-visual speech synthesis framework; computer-aided pronunciation training; corrective feedback; decision tree; emphatic speech segmentation quality; hidden Markov model; large scale neutral corpus; small scale emphatic corpus; voice quality improvement; Acoustics; Context; Context modeling; Hidden Markov models; Speech; Speech synthesis; Training; computer-aided pronunciation training (CAPT); emphatic speech synthesis; hidden Markov model (HMM)
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178909
Filename :
7178909