Title :
Spectral modification for context-free voice conversion using MELP speech coding framework
Author :
Salor, Özgül ; Demirekler, Mübeccel
Author_Institution :
Dept. of Electr. & Electron. Eng., Middle East Tech. Univ., Ankara, Turkey
Abstract :
In this work, we have focused on spectral modification of speech for voice conversion from one speaker to another. The MELP (mixed excitation linear prediction) speech coding algorithm has been used as a speech analysis and synthesis framework. Using a 230-sentence triphone-balanced database of the two speakers, a mapping between the 4-stage vector quantization indexes for line spectral frequencies (LSFs) of the two speakers has been obtained. This mapping provides a context-free speech conversion for spectral properties of the speakers. Two different methods have been proposed to obtain the LSF mapping. The first method determines the corresponding source and target LSF codeword indexes, while the second method finds a new LSF codebook for the target speaker. After the spectral modification, pitch modification is applied to the source speaker's residual to approximate the target speaker's pitch range, and then the modified filter is driven by the modified residual signal. Subjective ABX listening tests have been carried out, and the correct speaker perception rate has been obtained as 80% and 77% for the first and the second spectral conversion methods, respectively. For future work, we are planning to integrate our previous work on LPC filter and residual relationship analysis to increase the correct speaker perception rate.
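The first method in the abstract, which pairs source and target LSF codeword indexes, can be illustrated with a minimal sketch. The abstract does not say how the correspondence is determined, so everything here is an assumption: the two index sequences are taken to be already frame-aligned (for example by time alignment of parallel sentences), each source index is mapped to the target index it co-occurs with most often, and only a single quantization stage is shown rather than the 4-stage quantizer. The function names `build_index_mapping` and `convert_indexes` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index_mapping(source_indexes, target_indexes):
    """Map each source codeword index to its most frequent aligned target index.

    Assumption: both sequences are frame-aligned and of equal length.
    """
    if len(source_indexes) != len(target_indexes):
        raise ValueError("index sequences must be frame-aligned and equal in length")

    # Count how often each target index appears alongside each source index.
    counts = defaultdict(Counter)
    for src, tgt in zip(source_indexes, target_indexes):
        counts[src][tgt] += 1

    # Majority vote per source index (an illustrative choice, not the paper's rule).
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}


def convert_indexes(source_indexes, mapping):
    """Replace source indexes by mapped target indexes.

    Assumption: indexes never seen in training are passed through unchanged.
    """
    return [mapping.get(src, src) for src in source_indexes]


# Toy training data: aligned codeword indexes for the two speakers.
train_source = [0, 0, 1, 1, 1, 2]
train_target = [5, 5, 7, 7, 3, 9]

mapping = build_index_mapping(train_source, train_target)
converted = convert_indexes([0, 1, 2, 4], mapping)
```

Because the lookup depends only on the current frame's index, a table of this kind is context-free in the sense the abstract uses. In a multi-stage quantizer, one such table per stage or a table over index tuples would be possible variants; the abstract does not indicate which was used.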
Keywords :
frequency convertors; linear predictive coding; speech coding; speech synthesis; vector quantisation; LPC filter; LSF codebook; LSF codeword indexes; LSF mapping; MELP speech coding; context-free voice conversion; line spectral frequencies; mixed excitation linear prediction; multiple sentence triphone balanced database; pitch modification; residual relationship analysis; speaker perception rate; speaker voice conversion; spectral conversion methods; spectral mapping; speech analysis; speech spectral modification; speech synthesis; target speaker pitch range; vector quantization indexes; Databases; Filters; Frequency; Indexes; Prediction algorithms; Speech analysis; Speech coding; Speech synthesis; Testing; Vector quantization;
Conference_Title :
Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
Print_ISBN :
0-7803-8687-6
DOI :
10.1109/ISIMP.2004.1434063