DocumentCode
2017308
Title
GMM-based voice conversion with explicit modelling on feature transform
Author
Chen, Ling-Hui ; Ling, Zhen-Hua ; Guo, Wu ; Dai, Li-Rong
Author_Institution
iFLYTEK Speech Lab., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2010
fDate
Nov. 29 2010-Dec. 3 2010
Firstpage
364
Lastpage
368
Abstract
In this paper, we propose a Gaussian mixture model (GMM) based voice conversion method using explicit feature transform models. A piecewise linear transform with stochastic bias is adopted to represent the relationship between the spectral features of the source and target speakers. These explicit transformations are integrated into the training of the GMM for the joint probability density of source and target features. The maximum likelihood parameter generation algorithm with dynamic features is used to generate the converted spectral trajectories. Our method can model the cross-dimension correlations for the joint density GMM (JDGMM) while significantly decreasing the computational cost compared with a JDGMM with full covariance. Experimental results show that the proposed method outperformed the conventional GMM-based method in cross-gender voice conversion.
Keywords
maximum likelihood estimation; piecewise linear techniques; probability; speech synthesis; GMM based voice conversion; Gaussian mixture model; computation cost; converted spectral trajectory; cross dimension correlation; dynamic feature; explicit modelling; feature transform; joint probability density; maximum likelihood parameter generation algorithm; piecewise linear transform; source speaker; spectral feature; stochastic bias; target speaker; Computational modeling; Covariance matrix; Heuristic algorithms; Hidden Markov models; Speech; Training; Transforms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684869
Filename
5684869