DocumentCode
3343846
Title
Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
Author
Kain, Alexander ; Macon, Michael W.
Author_Institution
Center for Spoken Language Understanding, Oregon Graduate Inst., Beaverton, OR, USA
Volume
2
fYear
2001
fDate
2001
Firstpage
813
Abstract
The purpose of a voice conversion (VC) system is to change the perceived speaker identity of a speech signal. We propose an algorithm based on converting the LPC spectrum and predicting the residual as a function of the target envelope parameters. We conduct listening tests based on speaker discrimination of same/difference pairs to measure the accuracy with which the converted voices match the desired target voices. To establish the level of human performance as a baseline, we first measure the ability of listeners to discriminate between original speech utterances under three conditions: normal, fundamental frequency and duration normalized, and LPC coded. Additionally, the spectral parameter conversion function is tested in isolation by listening to source, target, and converted speakers as LPC coded speech. The results show that the speaker identity of speech whose LPC spectrum has been converted can be recognized as the target speaker with the same level of performance as discriminating between LPC coded speech. However, the level of discrimination of converted utterances produced by the full VC system is significantly below that of speaker discrimination of natural speech.
Keywords
linear predictive coding; speech synthesis; LPC coded speech; LPC spectrum; human performance; listening tests; natural speech; original speech utterances; perceived speaker identity; residual prediction; speaker discrimination; spectral envelope mapping; spectral parameter conversion function; speech signal; voice conversion algorithm; Algorithm design and analysis; Databases; Humans; Linear predictive coding; Loudspeakers; Natural languages; Speech recognition; Target recognition; Testing; Virtual colonoscopy
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.941039
Filename
941039