DocumentCode
730770
Title
HMM-based emphatic speech synthesis for corrective feedback in computer-aided pronunciation training
Author
Yishuang Ning ; Zhiyong Wu ; Jia Jia ; Fanbo Meng ; Helen Meng ; Lianhong Cai
Author_Institution
Shenzhen Key Lab. of Inf. Sci. & Technol., Tsinghua Univ., Shenzhen, China
fYear
2015
fDate
19-24 April 2015
Firstpage
4934
Lastpage
4938
Abstract
This paper investigates incorporating hidden Markov model (HMM) based emphatic speech synthesis, used for audio exaggeration, into an audio-visual speech synthesis framework for corrective feedback in computer-aided pronunciation training (CAPT). To improve the voice quality of the synthetic emphatic speech, this paper proposes a new method for HMM training. In this method, the contextual questions used for decision tree building are extended to include emphasis-related information. HMMs are then trained using a small-scale emphatic corpus together with a large-scale neutral corpus. The emphatic corpus ensures the quality of the emphatic speech segments, whereas the neutral corpus further improves the quality of both the non-emphatic and the emphatic segments. Finally, emphatic speech synthesis is achieved by extending Flite+hts_engine. Experimental results show that the method can synthesize high-quality emphatic speech and make the feedback more discriminatively perceptible.
Keywords
audio-visual systems; decision trees; feedback; hidden Markov models; speech synthesis; CAPT; Flite+hts_engine; HMM-based emphatic speech synthesis; audio exaggeration; audio-visual speech synthesis framework; computer-aided pronunciation training; corrective feedback; decision tree; emphatic speech segmentation quality; hidden Markov model; large scale neutral corpus; small scale emphatic corpus; voice quality improvement; Acoustics; Context; Context modeling; Hidden Markov models; Speech; Speech synthesis; Training; computer-aided pronunciation training (CAPT); emphatic speech synthesis; hidden Markov model (HMM)
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178909
Filename
7178909