Title :
Improving the performance of MGM-based voice conversion by preparing training data method
Author :
Zuo, Guo-Yu ; Liu, Wen-Ju ; Ruan, Xiao-gang
Author_Institution :
Inst. of Autom., Acad. Sinica, Beijing, China
Abstract :
This paper proposes an approach to improve both the target speaker's individuality and the quality of the converted speech by preparing the training data. In mixture Gaussian spectral mapping (MGM) based voice conversion, spectral feature representations are analyzed to obtain the right feature associations between the source and target characteristics. A voiced/unvoiced (V/UV) decision scheme for time-alignment is provided to obtain the right data for training the MGM function while removing the misaligned data. Experiments are conducted on applying different spectral representation methods and V/UV decision strategies to the MGM functions. When linear predictive cepstral coefficients (LPCC) are used for time-alignment and the V/UV decisions are adopted for removing bad data, results show that the conversion function achieves better accuracy and the proposed method effectively improves the overall performance of voice conversion.
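The data-preparation idea in the abstract (time-align source and target frames, then discard misaligned pairs using V/UV decisions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the alignment algorithm or the exact V/UV rule, so the sketch assumes plain dynamic time warping with Euclidean frame distance and assumes a pair is "misaligned" when its two frames carry different voicing labels. The function names `dtw_path` and `keep_matching_voicing` are hypothetical, and the one-dimensional toy frames stand in for real LPCC vectors.

```python
from typing import List, Sequence, Tuple


def dtw_path(src: Sequence[Sequence[float]],
             tgt: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Align two feature sequences with plain DTW (assumed alignment method).

    Returns a list of (source_index, target_index) frame pairs.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((a - b) ** 2 for a, b in zip(src[i - 1], tgt[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def keep_matching_voicing(path: Sequence[Tuple[int, int]],
                          src_voiced: Sequence[bool],
                          tgt_voiced: Sequence[bool]) -> List[Tuple[int, int]]:
    """Drop aligned pairs whose V/UV labels disagree (assumed filtering rule)."""
    return [(i, j) for i, j in path if src_voiced[i] == tgt_voiced[j]]


# Toy example: 1-D "features" stand in for LPCC vectors.
src_frames = [[0.0], [1.0], [1.0], [0.0]]
tgt_frames = [[0.0], [1.0], [0.0]]
src_voiced = [False, True, True, False]
tgt_voiced = [False, True, True]  # last target frame labelled voiced

aligned = dtw_path(src_frames, tgt_frames)
training_pairs = keep_matching_voicing(aligned, src_voiced, tgt_voiced)
```

In this toy case the last aligned pair joins an unvoiced source frame to a voiced target frame, so it is removed before the remaining pairs would be passed on to train the mapping function.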
Keywords :
Gaussian distribution; cepstral analysis; signal representation; speech processing; LPCC; MGM-based voice conversion; conversion function accuracy; converted speech quality; linear predictive cepstral coefficients; mixture Gaussian spectral mapping voice conversion; prepared training data method; source/target feature associations; speaker individuality; spectral feature representations; voiced/unvoiced time-alignment decision scheme; Automation; Cepstral analysis; Control engineering; Covariance matrix; Laboratories; Loudspeakers; Pattern recognition; Speech analysis; Telephony; Training data;
Conference_Title :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
DOI :
10.1109/CHINSL.2004.1409616