Title :
Periodicity ratio extraction for mixed excitation model of vocoder-based speech synthesis
Author :
Gao, Weixun ; Cao, Qiying
Author_Institution :
Sch. of Inf. Sci. & Technol., Donghua University, Shanghai, China
Abstract :
Statistical vocoder-based speech synthesis systems have a small footprint and the flexibility to change voice characteristics. However, the synthesized speech sounds mechanical or slightly buzzy compared with natural human speech. A mixed excitation model, rather than either a periodic impulse train or white noise alone, is commonly used in low-bit-rate speech coding. In this paper, we extend it to statistical vocoder-based speech synthesis. We also compare two methods of extracting periodicity ratios for the mixed excitation model: the comb filter and the normalized correlation coefficient. Excitation parameters are modeled by HMMs in a slave manner, where the state boundaries are given by the spectral and pitch models. Two corpora, uttered by a male and a female speaker, are used to evaluate the mixed excitation model. The experimental results show that the mixed excitation model significantly improves the voice quality of synthesized speech and that the comb filter method of extracting periodicity ratios slightly outperforms the normalized correlation coefficient method.
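As a rough illustration of the normalized correlation coefficient idea named in the abstract, the following Python sketch scores how periodic a frame is at a given pitch lag. It is not the authors' implementation: the function names, the full-band (rather than per-band) analysis, and the clamping of the coefficient to [0, 1] as a "periodicity ratio" are all assumptions made for this example.

```python
import math


def normalized_correlation(frame, lag):
    """Normalized correlation between frame[n] and frame[n + lag].

    `frame` is a sequence of samples and `lag` is the pitch period in
    samples (both hypothetical inputs chosen for this sketch).
    """
    n = len(frame) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be positive and shorter than the frame")
    # Cross term and the energies of the two overlapping segments.
    num = sum(frame[i] * frame[i + lag] for i in range(n))
    e0 = sum(frame[i] ** 2 for i in range(n))
    e1 = sum(frame[i + lag] ** 2 for i in range(n))
    denom = math.sqrt(e0 * e1)
    # A silent frame has no defined correlation; treat it as aperiodic.
    return num / denom if denom > 0 else 0.0


def periodicity_ratio(frame, lag):
    """Clamp the coefficient to [0, 1] (an assumed mapping, not from the paper)."""
    return min(1.0, max(0.0, normalized_correlation(frame, lag)))
```

A value near 1 suggests the frame is dominated by the periodic (pulse-train) component, while a value near 0 suggests it is dominated by noise; in a mixed excitation setting such a score could weight the two sources.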
Keywords :
feature extraction; hidden Markov models; speech synthesis; vocoders; comb filter method; hidden Markov model; mixed excitation model; normalized correlation coefficient method; periodicity ratio extraction; statistical vocoder; vocoder-based speech synthesis; Bit rate; Computer science; Educational institutions; Hidden Markov models; Humans; Power harmonic filters; Speech coding; Speech synthesis; Vocoders; White noise;
Conference_Title :
Microelectronics & Electronics, 2009. PrimeAsia 2009. Asia Pacific Conference on Postgraduate Research in
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-4668-1
Electronic_ISBN :
978-1-4244-4669-8
DOI :
10.1109/PRIMEASIA.2009.5397413