DocumentCode :
3498324
Title :
Voice Conversion Without Parallel Speech Corpus Based on Mixtures of Linear Transform
Author :
Jian, Zhi-Hua ; Yang, Zhen
Author_Institution :
Inst. of Signal Process. & Transm., Nanjing Univ. of Posts & Telecommun., Nanjing
fYear :
2007
fDate :
21-25 Sept. 2007
Firstpage :
2825
Lastpage :
2828
Abstract :
This paper presents a voice conversion algorithm based on mixtures of linear transforms (Ms-LT) that avoids the need for parallel training data inherent in conventional approaches. Within a maximum likelihood framework, the EM algorithm is used to compute the parameters of the conversion function, and the chirp z-transform is used to enhance the spectral envelope, which is averaged as a result of the linear weighting. The proposed voice conversion system is evaluated using both objective and subjective measures. The experimental results demonstrate that the approach effectively transforms speaker identity and achieves results comparable to those of conventional methods that rely on a parallel corpus.
Keywords :
maximum likelihood estimation; speech processing; chirp z-transform; linear transform mixtures; maximum likelihood framework; voice conversion; Artificial neural networks; Chirp; Loudspeakers; Multimedia systems; Signal processing algorithms; Speech analysis; Speech processing; Speech synthesis; Telecommunication computing; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Wireless Communications, Networking and Mobile Computing, 2007. WiCom 2007. International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-1311-9
Type :
conf
DOI :
10.1109/WICOM.2007.701
Filename :
4340476