Title :
Combining Gaussian Mixture Model with Global Variance Term to Improve the Quality of an HMM-Based Polyglot Speech Synthesizer
Author :
Latorre, Javier ; Iwano, K. ; Furui, S.
Author_Institution :
Dept. of Comput. Sci., Tokyo Inst. of Technol., Japan
Abstract :
This paper proposes a new method for calculating the cepstral coefficients in an HMM-based synthesizer. The method directly maximizes the log-likelihood function of a Gaussian mixture model using a gradient ascent algorithm, which allows the global variance term to be integrated efficiently with a Gaussian mixture acoustic model. Perceptual experiments confirmed that these two factors each produce significant improvements in speech quality and that the improvements are independent of each other. The proposed method therefore makes it possible to obtain the benefits of both. The paper also proposes a 2-class global variance model that discriminates between consonants and vowels. This 2-class model produces more stable cepstral coefficients than the single-class one.
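The abstract gives no equations, so the following is only a toy sketch of the general idea it describes: gradient ascent on a log-likelihood that sums a per-frame Gaussian mixture term and a global variance term. Everything in it is an assumption made for illustration, not the authors' formulation: the coefficients are scalars, dynamic features are ignored, the global variance is modeled by a single Gaussian, and the names `log_likelihood`, `gradient`, `gradient_ascent`, `gv_mean`, `gv_var`, `gv_weight`, `step`, and `iters` are hypothetical.

```python
import math


def gauss_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def _variance(c):
    """Mean and (biased) variance of the sequence across frames."""
    mean_c = sum(c) / len(c)
    return mean_c, sum((x - mean_c) ** 2 for x in c) / len(c)


def log_likelihood(c, frames, gv_mean, gv_var, gv_weight=1.0):
    """Per-frame GMM log-likelihood plus a weighted global variance term.

    frames[t] is a list of (weight, mean, variance) tuples for frame t.
    """
    total = 0.0
    for x, mix in zip(c, frames):
        logs = [math.log(w) + gauss_logpdf(x, m, s) for w, m, s in mix]
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(v - peak) for v in logs))
    _, v = _variance(c)
    return total + gv_weight * gauss_logpdf(v, gv_mean, gv_var)


def gradient(c, frames, gv_mean, gv_var, gv_weight=1.0):
    """Analytic gradient of log_likelihood with respect to each c[t]."""
    n = len(c)
    mean_c, v = _variance(c)
    grad = []
    for x, mix in zip(c, frames):
        logs = [math.log(w) + gauss_logpdf(x, m, s) for w, m, s in mix]
        peak = max(logs)
        post = [math.exp(val - peak) for val in logs]
        z = sum(post)
        # GMM term: responsibility-weighted pull toward each component mean.
        g = sum(p / z * (m - x) / s for p, (_, m, s) in zip(post, mix))
        # Global variance term: chain rule through v(c).
        g += gv_weight * (-(v - gv_mean) / gv_var) * 2.0 * (x - mean_c) / n
        grad.append(g)
    return grad


def gradient_ascent(c0, frames, gv_mean, gv_var, gv_weight=1.0,
                    step=0.05, iters=300):
    """Fixed-step gradient ascent starting from c0 (no line search)."""
    c = list(c0)
    for _ in range(iters):
        g = gradient(c, frames, gv_mean, gv_var, gv_weight)
        c = [x + step * gx for x, gx in zip(c, g)]
    return c
```

A fixed step size is used here only for brevity; it has to be chosen small enough for the ascent to be stable, and the abstract does not say how the step is set in the paper.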
Keywords :
Gaussian processes; gradient methods; hidden Markov models; speech synthesis; Gaussian mixture acoustic model; HMM-based polyglot speech synthesizer; cepstral coefficients; direct log-likelihood function maximization; global variance term; gradient ascent algorithm; speech quality; Cepstral analysis; Computer science; Density functional theory; Electronic mail; Hidden Markov models; Probability; Speech synthesis; Synthesizers; Vocoders; Gaussian mixture; Global Variance; HMM-based speech synthesis; polyglot;
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.367301