DocumentCode :
3428212
Title :
Runtime and speech quality survey of a voice conversion method
Author :
Jokisch, Oliver ; Birhanu, Yitagessu ; Hoffmann, Raik
Author_Institution :
Inst. of Commun. Eng., Leipzig Univ. of Telecommun., Leipzig, Germany
fYear :
2013
fDate :
1-4 July 2013
Firstpage :
1690
Lastpage :
1694
Abstract :
Several methods for voice conversion have been established. The research aims at the characteristics of a target speaker and a near-to-natural speech quality. This contribution summarizes the listening experiments with four conversion methods including the assessment of speech quality, listening effort and similarity to the target voice. The subjective evaluation of similarity is checked by an instrumental distance measure based on logarithmic spectral distortion. Practical applications of voice conversion require an appropriate runtime performance and memory use. We select a conversion method based on VTLN to demonstrate the runtime and quality trade-off. In the case example, we survey the quality assessment depending on different training constellations with a varied data amount and training time. Furthermore, we discuss the runtime performance of the selected conversion method under typical operating conditions. The experiments cover the influence of system resources, setting of conversion parameters (warping factors) and different training constellations. The observed real-time factors of a non-optimized laboratory VC version are inappropriate for typical application scenarios.
Keywords :
quality management; resource allocation; spectral analysis; speech processing; VTLN-based conversion method; conversion parameters; instrumental distance measurement; logarithmic spectral distortion; near-to-natural speech quality; non-optimized laboratory VC version; quality assessment; real-time factors; runtime survey; speech quality survey; system resources; target speaker; target voice similarity; voice conversion method; Quality assessment; Real-time systems; Runtime; Signal processing algorithms; Speech; Standards; Training; MOS; VTLN; runtime performance; speech quality; voice conversion;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
EUROCON, 2013 IEEE
Conference_Location :
Zagreb
Print_ISBN :
978-1-4673-2230-0
Type :
conf
DOI :
10.1109/EUROCON.2013.6625204
Filename :
6625204