Title :
Time domain vocal tract length normalization
Author :
Sündermann, D. ; Bonafonte, Antonio ; Ney, Hermann ; Höge, Harald
Author_Institution :
Dept. of Signal Theor. & Commun., Univ. Politecnica de Catalunya, Barcelona, Spain
Abstract :
Recently, the speaker normalization technique VTLN (vocal tract length normalization), known from speech recognition, was applied to voice conversion. So far, VTLN has been performed in the frequency domain. However, to accelerate the conversion process, it is helpful to apply VTLN directly to the time frames of a speech signal. In this paper, we propose a technique that directly manipulates the time signal. Subjective tests show that the performance of voice conversion techniques based on frequency domain and time domain VTLN is equivalent in terms of speech quality, while the latter requires about 20 times less processing time.
Keywords :
signal processing; speaker recognition; speech synthesis; time-frequency analysis; VTLN; frequency domain; speaker normalization technique; speech recognition; speech signal; subjective test; time domain; vocal tract length normalization; voice conversion technique; Character recognition; Computer science; Frequency domain analysis; Loudspeakers; Signal processing; Speech analysis; Speech processing; Speech recognition; Speech synthesis; Testing
Conference_Title :
Signal Processing and Information Technology, 2004. Proceedings of the Fourth IEEE International Symposium on
Print_ISBN :
0-7803-8689-2
DOI :
10.1109/ISSPIT.2004.1433719