DocumentCode :
2281926
Title :
Voice quality conversion in TD-PSOLA speech synthesis
Author :
Sun, Xuejing
Author_Institution :
Dept. of Commun. Sci. & Disorders, Northwestern Univ., Evanston, IL, USA
Volume :
2
fYear :
2000
fDate :
2000
Abstract :
The capability of producing different voice qualities is highly desirable in modern speech synthesis systems. Diphone-based synthesizers using TD-PSOLA can generate high-quality synthetic speech. However, one drawback of such systems in comparison to formant or LPC synthesizers is their inflexibility in voice quality conversion (VQC). In this paper, the author presents a VQC method for the TD-PSOLA synthesis system. For vocal fry, the ST-signals are multiplied by a Kaiser window with alternating magnitude; for breathy voice, the ST-signals are first convolved with a one-pole filter, then combined with shaped noise signals, and finally multiplied by a Hanning window. All the windowed ST-signals are then overlap-added as in standard TD-PSOLA. The perceptual evaluation test shows that this method can generate the desired voice quality successfully.
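The processing chain named in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract gives no parameter values, so the Kaiser `beta`, the alternating gains, the pole location, the noise level, and the use of a Hann envelope to shape the noise are all assumptions, as are the function names. ST-signals are taken here to be pitch-synchronous segments that have already been extracted.

```python
import math
import random

def bessel_i0(x, terms=25):
    """Zeroth-order modified Bessel function, summed from its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kaiser(n, beta):
    """Kaiser window of length n (symmetric, peak value 1 at the center)."""
    if n == 1:
        return [1.0]
    denom = bessel_i0(beta)
    win = []
    for i in range(n):
        r = 2.0 * i / (n - 1) - 1.0
        win.append(bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / denom)
    return win

def hann(n):
    """Hann (Hanning) window of length n with zero-valued endpoints."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def one_pole(x, pole):
    """One-pole low-pass filter: y[n] = (1 - pole) * x[n] + pole * y[n-1]."""
    y, prev = [], 0.0
    for v in x:
        prev = (1.0 - pole) * v + pole * prev
        y.append(prev)
    return y

def vocal_fry(st_signals, beta=6.0, gains=(1.0, 0.4)):
    """Kaiser-window each ST-signal, alternating the window magnitude.

    beta and gains are assumed values, not taken from the paper.
    """
    out = []
    for k, seg in enumerate(st_signals):
        g = gains[k % 2]
        w = kaiser(len(seg), beta)
        out.append([g * wi * si for wi, si in zip(w, seg)])
    return out

def breathy(st_signals, pole=0.7, noise_level=0.05, seed=0):
    """Filter each ST-signal, add shaped noise, then apply a Hann window.

    The Hann-shaped Gaussian noise is an assumed stand-in for the paper's
    "shaped noise signals"; pole and noise_level are assumed values.
    """
    rng = random.Random(seed)
    out = []
    for seg in st_signals:
        w = hann(len(seg))
        filtered = one_pole(seg, pole)
        mixed = [f + noise_level * wi * rng.gauss(0.0, 1.0)
                 for f, wi in zip(filtered, w)]
        out.append([wi * v for wi, v in zip(w, mixed)])
    return out

def overlap_add(frames, centers, length):
    """Sum frames into an output buffer, each centered on its pitch mark."""
    out = [0.0] * length
    for frame, c in zip(frames, centers):
        start = c - len(frame) // 2
        for i, v in enumerate(frame):
            j = start + i
            if 0 <= j < length:
                out[j] += v
    return out
```

For example, `overlap_add(vocal_fry(frames), centers, length)` would yield a signal in which every second pitch period is attenuated, one plausible reading of "alternating magnitude"; the actual window parameters would have to come from the paper itself.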
Keywords :
convolution; speech synthesis; time-domain synthesis; Hanning window; Kaiser window; ST-signals; TD-PSOLA synthesis system; VQC method; breathy voice; diphone based synthesizers; high quality synthetic speech; one-pole filter; perceptual evaluation test; shaped noise signals; time domain pitch synchronous overlap-add system; vocal fry; voice quality conversion; Acoustics; Control system synthesis; Filters; Laboratories; Linear predictive coding; Modems; Signal synthesis; Sun; Synthesizers
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
ISSN :
1520-6149
Print_ISBN :
0-7803-6293-4
Type :
conf
DOI :
10.1109/ICASSP.2000.859119
Filename :
859119