DocumentCode
2064785
Title
Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems
Author
Zhang, Meng ; Tao, Jianhua ; Jia, Huibin ; Wang, Xia
Author_Institution
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., China
fYear
2008
fDate
16-19 Dec. 2008
Firstpage
1
Lastpage
4
Abstract
Although hidden Markov model (HMM) based speech synthesis has been shown to perform well, several factors still degrade the quality of the synthesized speech: the vocoder, model accuracy, and over-smoothing. This paper analyzes these factors separately and proposes modifications for removing each of them. Experimental results show that over-smoothing in the frequency domain is the main factor affecting the quality of synthesized speech, whereas over-smoothing in the time domain is nearly negligible. Time-domain over-smoothing is generally caused by limited accuracy of the model structure, and frequency-domain over-smoothing by limited accuracy of the training algorithm. The currently used model structure is capable of representing speech without quality degradation, whereas the ML-estimation based parameter training algorithm causes perceptual distortion in the synthesized speech. Modifying the parameter training algorithm is therefore more likely to improve synthesis performance.
Keywords
frequency-domain analysis; hidden Markov models; maximum likelihood estimation; speech synthesis; time-domain analysis; vocoders; HMM based speech synthesis; ML-estimation based parameter training algorithm; frequency domain oversmoothing; hidden Markov model; over-smoothing problem reduction; time domain over-smoothing; vocoder; Algorithm design and analysis; Degradation; Frequency domain analysis; Hidden Markov models; High temperature superconductors; Laboratories; Pattern recognition; Speech analysis; Speech synthesis; Vocoders;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on
Conference_Location
Kunming
Print_ISBN
978-1-4244-2942-4
Electronic_ISBN
978-1-4244-2943-1
Type
conf
DOI
10.1109/CHINSL.2008.ECP.16
Filename
4730270