DocumentCode :
1416859
Title :
Control of spectral dynamics in concatenative speech synthesis
Author :
Wouters, Johan ; Macon, Michael W.
Author_Institution :
Center for Spoken Language Understanding, Oregon Graduate Inst. of Sci. & Technol., Beaverton, OR, USA
Volume :
9
Issue :
1
fYear :
2001
fDate :
1/1/2001
Firstpage :
30
Lastpage :
38
Abstract :
Current speech synthesis methods based on the concatenation of waveform units can produce highly intelligible speech capturing the identity of a particular speaker. However, the quality of concatenated speech often suffers from discontinuities between the acoustic units, due to contextual differences and variations in speaking style across the database. In this paper, we present methods to spectrally modify speech units in a concatenative synthesizer to correspond more closely to the acoustic transitions observed in natural speech. First, a technique called “unit fusion” is proposed to reduce spectral mismatch between units. In addition to concatenation units, a second, independent tier of units is selected that defines the desired spectral dynamics at concatenation points. Both unit tiers are “fused” to obtain natural transitions throughout the synthesized utterance. The unit fusion method is further extended to control the perceived degree of articulation of concatenated units. A signal processing technique based on sinusoidal modeling is also presented that enables high-quality resynthesis of units with a modified spectral shape.
Keywords :
acoustic signal processing; spectral analysis; speech intelligibility; speech synthesis; acoustic transitions; acoustic units discontinuities; articulation control; concatenated speech quality; concatenation units; concatenative speech synthesis; concatenative synthesizer; database; high intelligible speech; high-quality resynthesis; modified spectral shape; natural speech; signal processing; sinusoidal modeling; speaking style variations; spectral dynamics control; spectral mismatch reduction; unit fusion; unit fusion method; waveform units; Concatenated codes; Databases; Loudspeakers; Natural languages; Runtime; Signal processing; Signal synthesis; Spectral shape; Speech synthesis; Synthesizers;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.890069
Filename :
890069