Integration of rule-based formant synthesis and waveform concatenation: a hybrid approach to text-to-speech synthesis

Author

Hertz, Susan R.

Author_Institution

Dept. of Linguistics, Cornell Univ., Ithaca, NY, USA

fYear

2002

fDate

11-13 Sept. 2002

Firstpage

87

Lastpage

90

Abstract

This paper describes an approach to speech synthesis in which waveform fragments dynamically produced with a set of formant-based synthesis rules are concatenated with pre-stored natural speech waveform fragments to produce a synthetic utterance. While this hybrid approach was originally implemented as a tool for research into improved voice quality in formant-based synthesis, it has produced such good results that we now view it as a potentially viable and advantageous approach for a text-to-speech product. Possible advantages of the approach include smaller speech databases for waveform concatenation, enhancement of certain speech cues for sub-optimal listening environments, and improved and more efficient unit selection/production. In addition, the approach has already proven its utility as a tool for research and development in both concatenative and formant-based synthesis.

Keywords

knowledge based systems; speech enhancement; speech synthesis; efficient unit selection/production; rule-based formant synthesis; speech cue enhancement; speech database size; sub-optimal listening environments; synthetic utterance; text-to-speech synthesis; waveform concatenation; Concatenated codes; Databases; Degradation; Humans; Natural languages; Research and development; Speech enhancement; Speech synthesis; Splicing; Synthesizers

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of 2002 IEEE Workshop on Speech Synthesis

Print_ISBN

0-7803-7395-2

Type

conf

DOI

10.1109/WSS.2002.1224379

Filename

1224379