Title :
High quality text-to-speech synthesis: a comparison of four candidate algorithms
Author_Institution :
Faculté Polytechnique de Mons, Belgium
Abstract :
We investigate the use of four candidate speech models in the context of high quality text-to-speech systems (HQ-TTS), address problems typically encountered by their prosody matching and segment concatenation modules, and compare their performance with respect to the segment database compression ratio they allow, the computational load of the related synthesis algorithms, and their intelligibility and subjective segmental quality. The models addressed are: the classical auto-regressive (LPC) model; the hybrid harmonic/stochastic (H/S) model proposed by Griffin and Lim (1988) and by Abrantes, Marques and Trancoso (1991); the 'null' model, as implemented by the time-domain pitch-synchronous overlap-add (TD-PSOLA) synthesis algorithm; and the multi-band re-synthesis pitch-synchronous overlap-add (MBR-PSOLA) model.
Keywords :
autoregressive processes; linear predictive coding; speech coding; speech intelligibility; speech synthesis; stochastic processes; time-domain synthesis; MBR-PSOLA; TD-PSOLA; auto-regressive LPC model; computational load; hybrid harmonic/stochastic model; intelligibility; multi-band re-synthesis pitch-synchronous overlap-add; null model; prosody matching modules; segment concatenation modules; segment database compression ratio; speech models; subjective segmental quality; synthesis algorithms; text-to-speech synthesis; time-domain pitch-synchronous overlap-add; Context modeling; Databases; Frequency synthesizers; Linear predictive coding; Signal synthesis; Speech analysis; Speech enhancement; Speech synthesis; Testing; Time domain analysis;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location :
Adelaide, SA
Print_ISBN :
0-7803-1775-0
DOI :
10.1109/ICASSP.1994.389231