Regional Information Center for Science and Technology - Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems

DocumentCode :

2064785

Title :

Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems

Author :

Zhang, Meng ; Tao, Jianhua ; Jia, Huibin ; Wang, Xia

Author_Institution :

Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., China

fYear :

2008

fDate :

16-19 Dec. 2008

Firstpage :

Lastpage :

Abstract :

Although hidden Markov model (HMM) based speech synthesis has been shown to perform well, several factors still degrade the quality of synthesized speech: the vocoder, model accuracy, and over-smoothing. This paper analyzes these factors separately and proposes modifications for removing each of them. Experimental results show that over-smoothing in the frequency domain mainly affects the quality of synthesized speech, whereas over-smoothing in the time domain can nearly be ignored. Time-domain over-smoothing is generally caused by inaccuracy of the model structure, and frequency-domain over-smoothing is caused by inaccuracy of the training algorithm. The currently used model structure is capable of representing speech without quality degradation, whereas the ML-estimation based parameter training algorithm causes perceptual distortion in synthesized speech. Improving the parameter training algorithm is therefore more likely to improve synthesis performance.

Keywords :

frequency-domain analysis; hidden Markov models; maximum likelihood estimation; speech synthesis; time-domain analysis; vocoders; HMM based speech synthesis; ML-estimation based parameter training algorithm; frequency domain oversmoothing; hidden Markov model; over-smoothing problem reduction; time domain over-smoothing; vocoder; Algorithm design and analysis; Degradation; Frequency domain analysis; Hidden Markov models; High temperature superconductors; Laboratories; Pattern recognition; Speech analysis; Speech synthesis; Vocoders;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on

Conference_Location :

Kunming

Print_ISBN :

978-1-4244-2942-4

Electronic_ISBN :

978-1-4244-2943-1

Type :

conf

DOI :

10.1109/CHINSL.2008.ECP.16

Filename :

4730270

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2064785