DocumentCode :
1933094
Title :
Compression of prosody for speech modification in synthesis
Author :
Ansari, Rashid ; Kurek, Wojciech
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Illinois Univ., Chicago, IL, USA
Volume :
1
fYear :
1997
fDate :
2-5 Nov. 1997
Firstpage :
219
Abstract :
Methods of compressing the prosodic information about a speaker's delivery (pitch, duration, and amplitude) are described. The objective is to use available or extracted knowledge of the spoken text along with the prosodic information to synthesize speech from a suitable inventory of stored basic units. Techniques for compressing pitch and amplitude of speech units using transform coding are investigated. Discrete cosine and sine transforms are found to be effective in compressing pitch and amplitude information, respectively. In order to generate speech with these coded pitch and amplitude contours, the prosodic features of speech units stored in an inventory are modified using a method that was proposed to perform speech modification for concatenative synthesis. In this method, the stored speech unit is processed with a suitably shaped time-varying prefilter, whose parameters are chosen to have a low sensitivity to pitch changes. The filtered signal is modified according to the required change in its prosodic structure, and then applied to the inverse of the prefilter. Examples of application of the proposed representation and modification of prosody to a variety of speech units are presented.
Keywords :
data compression; discrete cosine transforms; filtering theory; inverse problems; signal representation; speech coding; speech synthesis; time-varying filters; transform coding; DCT; coded amplitude contour; coded pitch contour; concatenative synthesis; discrete cosine transform; discrete sine transform; duration; extracted knowledge; filtered signal; inverse prefilter; prosodic information; prosody compression; prosody representation; speech communication; speech modification; speech synthesis; speech units; spoken text; stored prosody; time-varying prefilter; transform coding; Data mining; Databases; Discrete transforms; Electronic mail; Signal synthesis; Speech coding; Speech processing; Speech recognition; Speech synthesis; Transform coding;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signals, Systems & Computers, 1997. Conference Record of the Thirty-First Asilomar Conference on
Conference_Location :
Pacific Grove, CA, USA
ISSN :
1058-6393
Print_ISBN :
0-8186-8316-3
Type :
conf
DOI :
10.1109/ACSSC.1997.680166
Filename :
680166