Title :
Detection and emphatic realization of contrastive word pairs for expressive text-to-speech synthesis
Author :
Chunrong Li ; Zhiyong Wu ; Fanbo Meng ; Hsiang-Yun Meng ; Lianhong Cai
Author_Institution :
Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems, Tsinghua University, Shenzhen, China
Abstract :
This paper addresses the automatic detection of contrastive word pairs and their emphatic acoustic realization for expressive text-to-speech (TTS) synthesis in English. Support vector machines (SVMs) are used to detect contrastive word pairs automatically from lexical features, syntactic dependencies and semantic relations. Detection performance improves substantially when accent ratio and word identity features are added. Hidden Markov model (HMM) based speech synthesis is then used to generate emphatic speech by placing emphasis on the detected contrastive word pairs. Subjective experiments show that most listeners find emphasis on contrastive word pairs more acceptable than emphasis on non-contrastive word pairs, which indicates the importance of detecting contrastive word pairs accurately.
Keywords :
hidden Markov models; speech synthesis; support vector machines; English; HMM; SVM; TTS; acoustic realization; contrastive word pairs; emphatic realization; expressive text-to-speech synthesis; hidden Markov model; noncontrastive word pairs; word identity features; Feature extraction; Semantics; Speech; Syntactics; contrast; expressive text-to-speech (TTS) synthesis; hidden Markov model (HMM) based speech synthesis; support vector machines (SVMs)
Conference_Title :
2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
DOI :
10.1109/ISCSLP.2012.6423493