A hybrid approach to synthesize high quality Cantonese speech

Author

Min, Chu; Ching, P.C.

Author_Institution

Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Shatin, Hong Kong

Volume

1

fYear

1998

fDate

12-15 May 1998

Firstpage

277

Abstract

Synthesizing high quality speech necessitates an intelligent modification algorithm to adjust the important prosodic features of the pre-stored speech units to meet the desired output requirements, such as smoothness, naturalness and pleasantness. The time domain pitch-synchronous overlap and add (TD-PSOLA) scheme is a simple but effective method of varying the pitch and time scale of speech, and it can produce high quality synthetic output. However, when the prosodic pattern requires a drastic modification of the spectral content of the stored units, TD-PSOLA often generates speech that sounds reverberant. This paper develops a new hybrid synthesis method based on TD-PSOLA and a shape-invariant sinusoidal technique to alleviate the problem of reverberation. It is particularly useful for the generation of Cantonese speech, since it can cope with the rapidly changing pitch profile of Cantonese, which is a mono-syllabic and tonal language. The proposed method has been applied to construct a Cantonese synthesizer, which is shown to be capable of producing high quality Cantonese speech without reverberation.

Keywords

echo suppression; reverberation; spectral analysis; speech synthesis; time-domain synthesis; TD-PSOLA scheme; high quality Cantonese speech; hybrid approach; hybrid synthesis method; intelligent modification algorithm; mono-syllabic tonal language; naturalness; output requirements; pitch profile; pleasantness; pre-stored speech units; prosodic features; reverberant sound; shape-invariant sinusoidal technique; smoothness; spectral content; time domain pitch-synchronous overlap and add; time-scaling; Data mining; Humans; Natural languages; Phase distortion; Reverberation; Signal processing algorithms; Signal synthesis; Speech processing; Speech synthesis; Synthesizers;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.674421

Filename

674421