Title :
Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems
Author :
Zhang, Meng ; Tao, Jianhua ; Jia, Huibin ; Wang, Xia
Author_Institution :
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., China
Abstract :
Although hidden Markov model (HMM) based speech synthesis has been shown to perform well, three factors still degrade the quality of the synthesized speech: the vocoder, model accuracy, and over-smoothing. This paper analyzes these factors separately and proposes modifications to address each of them. Experimental results show that over-smoothing in the frequency domain mainly affects the quality of synthesized speech, whereas over-smoothing in the time domain can nearly be ignored. Time-domain over-smoothing is generally caused by inaccuracy in the model structure, and frequency-domain over-smoothing by inaccuracy in the training algorithm. The currently used model structure is capable of representing speech without quality degradation, but the ML-estimation based parameter training algorithm causes perceptual distortion in the synthesized speech. Modifying the parameter training algorithm is therefore more likely to improve synthesis performance.
Keywords :
frequency-domain analysis; hidden Markov models; maximum likelihood estimation; speech synthesis; time-domain analysis; vocoders; HMM based speech synthesis; ML-estimation based parameter training algorithm; frequency domain oversmoothing; hidden Markov model; over-smoothing problem reduction; time domain over-smoothing; vocoder; Algorithm design and analysis; Degradation; Frequency domain analysis; Hidden Markov models; High temperature superconductors; Laboratories; Pattern recognition; Speech analysis; Speech synthesis; Vocoders;
Conference_Title :
Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2942-4
Electronic_ISBN :
978-1-4244-2943-1
DOI :
10.1109/CHINSL.2008.ECP.16