DocumentCode :
1431743
Title :
Wavelet analysis used in text-to-speech synthesis
Author :
Kobayashi, Mei ; Sakamoto, Masaharu ; Saito, Takashi ; Hashimoto, Yasuhide ; Nishimura, Masafumi ; Suzuki, Kazuhiro
Author_Institution :
Res. Lab., IBM Japan Ltd., Tokyo, Japan
Volume :
45
Issue :
8
fYear :
1998
fDate :
8/1/1998 12:00:00 AM
Firstpage :
1125
Lastpage :
1129
Abstract :
This brief describes the use of wavelet analysis in the development of a Japanese text-to-speech (TTS) system for personal computers. The quality of synthesized speech is one of the most important features of any TTS system. Synthesis methods that are based on manipulation of the speech signal spectrum (e.g., linear predictive coding synthesis and formant synthesis) produce comprehensible but unnatural-sounding output. The lack of naturalness commonly associated with these methods results from the use of oversimplified speech models, small synthesis unit inventories, and poor handling of text parsing for prosody control. We developed four new technologies to overcome these difficulties and improve the quality of output from TTS systems: accurate pitch mark determination by wavelet analysis, speech waveform generation using a modified time-domain pitch-synchronous overlap-add method, speech synthesis unit selection using a context-dependent clustering method, and efficient prosody control using a 3-phrase parser. All four technologies will be described; however, those that rely on wavelet techniques will be emphasized.
Keywords :
linear predictive coding; speech coding; speech synthesis; wavelet transforms; Japanese text-to-speech synthesis; context dependent clustering method; formant synthesis; linear predictive coding synthesis; oversimplified speech models; personal computers; pitch mark determination; prosody control; speech signal spectrum manipulation; speech waveform generation; synthesis unit inventories; text parsing; three-phrase parser; time domain pitch synchronous overlap-add method; unnatural sounding output; wavelet analysis; Control system synthesis; Linear predictive coding; Microcomputers; Signal synthesis; Speech analysis; Speech coding; Speech synthesis; Synchronous generators; Wavelet analysis; Wavelet domain;
fLanguage :
English
Journal_Title :
Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1057-7130
Type :
jour
DOI :
10.1109/82.718823
Filename :
718823