DocumentCode
3341337
Title
A system for voice conversion based on probabilistic classification and a harmonic plus noise model
Author
Stylianou, Yannis ; Cappé, Olivier
Author_Institution
Res. Labs., AT&T Labs., Florham Park, NJ, USA
Volume
1
fYear
1998
fDate
12-15 May 1998
Firstpage
281
Abstract
Voice conversion is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). This paper describes a system for efficient voice conversion. A novel mapping function is presented which associates the acoustic space of the source speaker with the acoustic space of the target speaker. The proposed system is based on the use of a Gaussian mixture model (GMM) to model the acoustic space of a speaker and a pitch-synchronous harmonic plus noise representation of the speech signal for prosodic modifications. The mapping function is a continuous parametric function which takes into account the probabilistic classification provided by the GMM. Evaluation by objective tests showed that the proposed system was able to reduce the perceptual distance between the source and target speaker by 70%. Formal listening tests also showed that 97% of the converted speech was judged to be spoken by the target speaker while maintaining high speech quality.
Keywords
Gaussian processes; harmonics; noise; signal representation; speech intelligibility; speech processing; speech synthesis; Gaussian mixture model; acoustic space; continuous parametric function; formal listening tests; harmonic plus noise model; high speech quality; interpreted telephony; low rate bit speech coding; mapping function; objective tests; perceptual distance reduction; pitch synchronous harmonics; probabilistic classification; probability; prosodic modifications; source speaker; speech signal representation; target speaker; text-to-speech synthesis; voice conversion system; Acoustic noise; Acoustic testing; Gaussian noise; Loudspeakers; Speech coding; Speech recognition; Speech synthesis; System testing; Telephony; Vector quantization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location
Seattle, WA
ISSN
1520-6149
Print_ISBN
0-7803-4428-6
Type
conf
DOI
10.1109/ICASSP.1998.674422
Filename
674422