Regional Information Center for Science and Technology - Cross validation and Minimum Generation Error for improved model clustering in HMM-based TTS

DocumentCode :

3123762

Title :

Cross validation and Minimum Generation Error for improved model clustering in HMM-based TTS

Author :

Feng-Long Xie ; Yi-Jian Wu ; Soong, Frank K.

Author_Institution :

Microsoft Res. Asia, Beijing, China

fYear :

2012

fDate :

5-8 Dec. 2012

Firstpage :

Lastpage :

Abstract :

In HMM-based speech synthesis, the context-dependent hidden Markov model (HMM) is widely used for its capability to synthesize highly intelligible and fairly smooth speech. However, training HMMs well for all possible contexts is difficult, or even impossible, because the training data inherently cannot cover every context. As a result, models trained this way may overfit, and their ability to predict unseen contexts at test time is highly restricted. Recently, cross-validation (CV) has been applied to decision tree-based clustering with the Maximum-Likelihood (ML) criterion and has shown improved robustness in TTS synthesis. In this paper we generalize CV to decision tree clustering with a different criterion, Minimum Generation Error (MGE). Experimental results show that the generalization to MGE yields better TTS synthesis performance than the baseline systems.

Keywords :

decision trees; hidden Markov models; maximum likelihood estimation; pattern clustering; speech synthesis; HMM-based TTS; HMM-based speech synthesis; MGE; context-dependent hidden Markov model; cross validation; decision tree-based clustering; maximum-likelihood criterion; minimum generation error; Context; Speech; Training; Training data; HMM-based synthesis; context clustering

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Conference_Location :

Kowloon

Print_ISBN :

978-1-4673-2506-6

Electronic_ISBN :

978-1-4673-2505-9

Type :

conf

DOI :

10.1109/ISCSLP.2012.6423459

Filename :

6423459

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3123762