Title :
Label transform based cross-language speaker adaptation in bilingual (Mandarin-English) TTS
Author :
So, Yongjin ; Jia, Jia ; Wang, Yongxin ; Cai, Lianhong
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Abstract :
This paper studies cross-language speaker adaptation for HMM-based speech synthesis. To handle the case where the adaptation data and the main corpus are in different languages, we propose a label transform based cross-language speaker adaptation approach. To transform phone sequences between English and Chinese, a new Mandarin-English phonetic alphabet, HCSIPA, is designed. Then, in addition to the traditional Kullback-Leibler divergence, a phoneme similarity measure, AMD, which takes articulation differences into account, is proposed to measure the similarity between phonemes. Finally, a perception-based phoneme mapping strategy is implemented to increase the mapping accuracy between Mandarin and English phonemes. Perceptual tests support the soundness of our approach: the adapted speech has high naturalness and is judged to be similar to the target speaker's.
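As an illustration of the Kullback-Leibler divergence baseline mentioned in the abstract, the following is a minimal sketch of KLD-based phoneme mapping. It assumes each phoneme is summarized by a single diagonal-covariance Gaussian, which is a simplification and not necessarily the model used in the paper. The function names, the toy phoneme inventories, and all numeric values are hypothetical. The AMD measure and the perception-based mapping strategy are not reproduced here, since the abstract does not define them.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(p, q):
    """Symmetrized divergence KL(p || q) + KL(q || p); p and q are (mean, var) pairs."""
    return kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p)


def map_phonemes(source, target):
    """Map each source phoneme to the target phoneme with the smallest symmetric KLD."""
    return {
        s: min(target, key=lambda t: symmetric_kl(source[s], target[t]))
        for s in source
    }


# Hypothetical toy models: phoneme -> (mean vector, variance vector).
toy_english = {
    "iy": ([0.0, 1.0], [1.0, 1.0]),
    "s": ([5.0, -2.0], [1.0, 2.0]),
}
toy_mandarin = {
    "i": ([0.1, 0.9], [1.0, 1.0]),
    "s": ([4.8, -2.1], [1.0, 2.0]),
    "a": ([-3.0, 3.0], [1.0, 1.0]),
}

if __name__ == "__main__":
    print(map_phonemes(toy_english, toy_mandarin))
```

With these toy values, each English phoneme is mapped to the Mandarin phoneme whose Gaussian lies closest under the symmetric divergence. A purely acoustic mapping of this kind is the sort of baseline that an articulation-aware measure such as AMD would be intended to complement.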
Keywords :
hidden Markov models; natural language processing; speaker recognition; speech synthesis; transforms; AMD; English-Chinese phone sequence; HCSIPA; HMM-based speech synthesis; Kullback-Leibler divergence; Mandarin-English TTS; Mandarin-English phonetic alphabet; bilingual TTS; cross-language speaker adaptation; label transform based cross-language speaker adaptation; perception-based phoneme mapping strategy; phoneme similarity measure; Accuracy; Adaptation models; Hidden Markov models; Speech; Speech synthesis; Transforms; Vectors;
Conference_Title :
Audio, Language and Image Processing (ICALIP), 2012 International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0173-2
DOI :
10.1109/ICALIP.2012.6376754