A system for voice conversion based on probabilistic classification and a harmonic plus noise model

Author

Stylianou, Yannis ; Cappé, Olivier

Author_Institution

Res. Labs., AT&T Labs., Florham Park, NJ, USA

Volume

1

fYear

1998

fDate

12-15 May 1998

Firstpage

281

Abstract

Voice conversion is defined as modifying the speech signal of one speaker (the source speaker) so that it sounds as if it had been pronounced by a different speaker (the target speaker). This paper describes a system for efficient voice conversion. A novel mapping function is presented which associates the acoustic space of the source speaker with the acoustic space of the target speaker. The proposed system is based on a Gaussian mixture model (GMM) of a speaker's acoustic space and a pitch-synchronous harmonic plus noise representation of the speech signal for prosodic modifications. The mapping function is a continuous parametric function which takes into account the probabilistic classification provided by the GMM. Evaluation by objective tests showed that the proposed system was able to reduce the perceptual distance between the source and target speaker by 70%. Formal listening tests also showed that 97% of the converted speech was judged to be spoken by the target speaker while maintaining high speech quality.
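The abstract does not give the form of the mapping function, so the following is only a minimal sketch of one way a continuous, GMM-weighted mapping could look. It assumes each mixture component carries its own local linear transform and that the transforms are blended by the component posteriors. The names `gmm_posteriors`, `convert_frame`, `offsets`, and `cross` are hypothetical and introduced for illustration; they are not taken from the paper.

```python
import numpy as np

def gmm_posteriors(x, weights, means, covs):
    """Posterior probability of each mixture component given one source frame x."""
    log_p = np.empty(len(weights))
    for i, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        log_p[i] = np.log(w) - 0.5 * (logdet + maha + len(x) * np.log(2 * np.pi))
    # Normalize in the log domain for numerical stability.
    log_p -= log_p.max()
    p = np.exp(log_p)
    return p / p.sum()

def convert_frame(x, weights, means, covs, offsets, cross):
    """Blend per-component linear transforms by the posterior of each component.

    offsets[i] and cross[i] are assumed (hypothetical) per-component parameters
    that would be fitted on aligned source/target frames.
    """
    post = gmm_posteriors(x, weights, means, covs)
    y = np.zeros_like(offsets[0], dtype=float)
    for p, mu, cov, nu, gamma in zip(post, means, covs, offsets, cross):
        y += p * (nu + gamma @ np.linalg.solve(cov, x - mu))
    return y
```

Because the posteriors vary smoothly with the input frame, the output of this sketch changes continuously as a frame moves between components, in contrast to a hard assignment of each frame to a single class.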

Keywords

Gaussian processes; harmonics; noise; signal representation; speech intelligibility; speech processing; speech synthesis; Gaussian mixture model; acoustic space; continuous parametric function; formal listening tests; harmonic plus noise model; high speech quality; interpreted telephony; low rate bit speech coding; mapping function; objective tests; perceptual distance reduction; pitch synchronous harmonics; probabilistic classification; probability; prosodic modifications; source speaker; speech signal representation; target speaker; text-to-speech synthesis; voice conversion system; Acoustic noise; Acoustic testing; Gaussian noise; Loudspeakers; Speech coding; Speech recognition; Speech synthesis; System testing; Telephony; Vector quantization;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.674422

Filename

674422