DocumentCode :
3340902
Title :
TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
Author :
Syrdal, Ann ; Stylianou, Yannis ; Garrison, Laurie ; Conkie, Alistair ; Schroeter, Juergen
Author_Institution :
Res. Labs., AT&T Labs., Florham Park, NJ, USA
Volume :
1
fYear :
1998
fDate :
12-15 May 1998
Firstpage :
273
Abstract :
In an effort to select a speech representation for our next-generation concatenative text-to-speech synthesizer, two candidates are investigated: TD-PSOLA and the harmonic plus noise model (HNM). A formal listening test was conducted in which the two candidates were rated on intelligibility, naturalness, and pleasantness. Their capacity for database compression and their computational load are also discussed. The results show that HNM consistently outperforms TD-PSOLA on all of the above criteria except computational load. HNM allows for high-quality speech synthesis without smoothing problems at the segmental boundaries and without the buzziness or other oddities observed with TD-PSOLA.
Keywords :
acoustic noise; speech intelligibility; speech synthesis; HNM; TD-PSOLA; buzziness; computational load; database compression; diphone based speech synthesis; formal listening test; harmonic plus noise model; high-quality speech synthesis; intelligibility; naturalness; next generation concatenative text-to-speech synthesizer; pleasantness; segmental boundaries; speech representation; Acoustic noise; Linear predictive coding; Man machine systems; Smoothing methods; Spatial databases; Speech analysis; Speech enhancement; Speech synthesis; Synthesizers; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
ISSN :
1520-6149
Print_ISBN :
0-7803-4428-6
Type :
conf
DOI :
10.1109/ICASSP.1998.674420
Filename :
674420