Regional Information Center for Science and Technology (RICEST) - State mapping for cross-language speaker adaptation in TTS

DocumentCode :

3529032

Title :

State mapping for cross-language speaker adaptation in TTS

Author :

Chen, Yi-Ning ; Jiao, Yang ; Qian, Yao ; Soong, Frank K.

Author_Institution :

Microsoft Res. Asia, Beijing

fYear :

2009

fDate :

19-24 April 2009

Firstpage :

4273

Lastpage :

4276

Abstract :

Cross-language speaker adaptation has many interesting applications, e.g. speech-to-speech translation. However, in cross-language speaker adaptation, the common phoneme set that can be assumed for different speakers of the same language no longer exists. Instead, a nearest-neighbor-based phoneme mapping from one language to the other has been adopted. In this study, we use our recently proposed sub-phonemic HMM state mapping for cross-language adaptation. The sub-phonemic HMM states, owing to their nature as phonetic segments, tend to be more sharable across languages than phonemes. Kullback-Leibler divergence, an information-theoretic measure, is used to measure the similarity between states in different languages. Experimental results show that the new state mapping outperforms the phoneme mapping baseline system in terms of three objective measures: log spectral distance, F0 adaptation error and F0 correlation. Compared with intra-language adaptation, the cross-language result of the new algorithm is also fairly good.
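
The state mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following are assumptions: each HMM state is modeled by a single diagonal-covariance Gaussian, the divergence is symmetrized by summing both directions, and every state in one language is mapped to its nearest state in the other. The function names (`kl_diag_gauss`, `symmetric_kl`, `map_states`) and the toy data are hypothetical.

```python
import numpy as np


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians (closed form)."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(x, dtype=float) for x in (mu_p, var_p, mu_q, var_q)
    )
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def symmetric_kl(state_a, state_b):
    """Symmetrized divergence: KL(a || b) + KL(b || a) (an assumption)."""
    (mu_a, var_a), (mu_b, var_b) = state_a, state_b
    return kl_diag_gauss(mu_a, var_a, mu_b, var_b) + kl_diag_gauss(
        mu_b, var_b, mu_a, var_a
    )


def map_states(states_x, states_y):
    """Map each state name in states_x to the nearest state name in states_y.

    Both arguments are dicts of name -> (mean vector, variance vector).
    """
    return {
        x_name: min(states_y, key=lambda y: symmetric_kl(x_state, states_y[y]))
        for x_name, x_state in states_x.items()
    }


if __name__ == "__main__":
    # Toy two-dimensional states for two hypothetical languages.
    lang_a = {"a1": ([0.0, 0.0], [1.0, 1.0]), "a2": ([5.0, 5.0], [1.0, 1.0])}
    lang_b = {"b1": ([0.2, -0.1], [1.0, 1.0]), "b2": ([4.8, 5.3], [1.2, 0.9])}
    print(map_states(lang_b, lang_a))
```

A real system would compare states over spectral and F0 streams, possibly with Gaussian mixtures, for which the divergence has no closed form and must be approximated; the sketch only shows the nearest-neighbor principle.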

Keywords :

acoustic signal processing; hidden Markov models; information theory; natural languages; speech processing; speech recognition; speech synthesis; HMM-based speech synthesis; Kullback-Leibler divergence; TTS system; acoustic-phonetic event; common phoneme set; cross-language speaker adaptation; information-theoretic measure; speech recognition; subphonemic HMM state mapping; text-to-speech system; Acoustic measurements; Asia; Flowcharts; Hidden Markov models; Loudspeakers; Natural languages; Nearest neighbor searches; Speech processing; Speech recognition; Speech synthesis; Cross language; HMM-based TTS; Kullback-Leibler divergence; Speaker adaptation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location :

Taipei

ISSN :

1520-6149

Print_ISBN :

978-1-4244-2353-8

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2009.4960573

Filename :

4960573

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3529032