Title :
Applying the harmonic plus noise model in concatenative speech synthesis
Author :
Stylianou, Yannis
Author_Institution :
Shannon Labs., AT&T Labs.-Res., Florham Park, NJ, USA
Date :
1/1/2001
Abstract :
This paper describes the application of the harmonic plus noise model (HNM) to concatenative text-to-speech (TTS) synthesis. In HNM, a speech signal is represented as a time-varying harmonic component plus a modulated noise component. Decomposing the signal into these two components allows more natural-sounding modifications, because each component can be modified with a different scheme that is better adapted to it. The parametric representation of speech in HNM also provides a straightforward way of smoothing discontinuities between acoustic units around concatenation points. Formal listening tests have shown that HNM provides high-quality speech synthesis while outperforming other synthesis models (e.g., TD-PSOLA) in intelligibility, naturalness, and pleasantness.
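As an illustration of the decomposition described in the abstract, the following is a minimal sketch of one voiced frame built as a sum of harmonics plus modulated noise. It is not the paper's implementation: the function name `synthesize_hnm_frame`, its parameters, the raised-cosine envelope, and the use of unfiltered white noise are all assumptions made for this example, and any spectral shaping of the noise part is left out.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, noise_gain,
                         sample_rate=16000, num_samples=320, seed=0):
    """Return (harmonic, noise, total) sample lists for one voiced frame.

    Hypothetical helper for illustration only; parameters are held
    constant over the frame, whereas a real system would vary them in time.
    """
    rng = random.Random(seed)
    harmonic, noise, total = [], [], []
    for n in range(num_samples):
        t = n / sample_rate
        # Harmonic part: sinusoids at integer multiples of the fundamental f0.
        h = sum(a * math.cos(2.0 * math.pi * (k + 1) * f0 * t + p)
                for k, (a, p) in enumerate(zip(amplitudes, phases)))
        # Noise part: white Gaussian noise modulated by a pitch-synchronous
        # envelope (a raised cosine with period 1/f0, assumed here).
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * f0 * t))
        e = noise_gain * envelope * rng.gauss(0.0, 1.0)
        harmonic.append(h)
        noise.append(e)
        total.append(h + e)
    return harmonic, noise, total


if __name__ == "__main__":
    harmonic, noise, total = synthesize_hnm_frame(
        f0=100.0, amplitudes=[1.0, 0.5], phases=[0.0, 0.0], noise_gain=0.1)
    print(len(total), "samples synthesized")
```

Keeping the two parts in separate lists mirrors the point made in the abstract: each component can be modified on its own (for example, rescaling `f0` for the harmonic part only) before the two are summed.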
Keywords :
acoustic signal processing; harmonics; noise; signal representation; smoothing methods; speech intelligibility; speech synthesis; acoustic units; adapted schemes; concatenative text-to-speech synthesis; discontinuities smoothing; formal listening tests; harmonic plus noise model; high-quality speech synthesis; modulated noise component; natural-sounding signal modifications; parametric speech representation; speech naturalness; speech pleasantness; speech signal decomposition; speech signals representation; time-varying harmonic component; Acoustic noise; Context modeling; Degradation; Filters; Linear predictive coding; Phase estimation; Signal synthesis; Speech processing; Speech synthesis; Transaction databases
Journal_Title :
Speech and Audio Processing, IEEE Transactions on