Title :
Modelling the global acoustic correlates of expressivity for Chinese text-to-speech synthesis
Author :
Hongwu Yang ; Meng, H.M. ; Zhiyong Wu ; Lianhong Cai
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
Abstract :
This paper proposes a novel approach for describing the expressive elements in dialog response messages for expressive text-to-speech synthesis. We adopt the three-dimensional PAD emotional model to describe expressivity based on the content of a response message and its dialog state. In particular, we use the P (pleasure) and A (arousal) descriptors to describe expressivity at the local, prosodic-word level based on the word's semantics. We use the D (dominance) descriptor to describe expressivity at the global, utterance level based on the utterance's dialog act. Our study is based on response messages of a spoken dialog system in the Hong Kong tourism domain. We prepared contrastive (neutral versus expressive) recordings to aid identification of the acoustic correlates of expressivity at both local and global levels. We used acoustic analysis of these contrastive recordings to establish a nonlinear model that modulates input neutral speech at both local and global levels to generate expressive output speech. This work focuses on the nonlinear relationship between the D (dominance) values and their acoustic correlates. Perceptual evaluation indicates that with local modulation of input neutral speech, over 73% of utterances carry appropriate expressivity. Combining local and global modulation yields nearly 84% expressive utterances.
Keywords :
emotion recognition; interactive systems; natural language processing; speech synthesis; dialog act; dialog response messages; expressive Chinese text-to-speech synthesis; nonlinear model; prosodic-word level; spoken dialog system; three-dimensional PAD emotional model; Acoustic measurements; Acoustical engineering; Atherosclerosis; Computer science; Nonlinear acoustics; Research and development management; Space technology; Speech analysis; Speech synthesis; Systems engineering and theory;
Conference_Title :
Spoken Language Technology Workshop, 2006. IEEE
Conference_Location :
Palm Beach
Print_ISBN :
1-4244-0872-5
DOI :
10.1109/SLT.2006.326837