DocumentCode
697747
Title
Eigenresiduals for improved parametric speech synthesis
Author
Drugman, Thomas ; Wilfart, Geoffrey ; Dutoit, Thierry
Author_Institution
TCTS Lab., Fac. Polytech. de Mons, Mons, Belgium
fYear
2009
fDate
24-28 Aug. 2009
Firstpage
2176
Lastpage
2180
Abstract
Statistical parametric speech synthesizers have recently shown their ability to produce natural-sounding and flexible voices. Unfortunately, the delivered quality suffers from a typical buzziness because the speech is vocoded. This paper proposes a new excitation model to reduce this undesirable effect. The model is based on the decomposition of pitch-synchronous residual frames on an orthonormal basis obtained by Principal Component Analysis (PCA). This basis contains a limited number of eigenresiduals and is computed on a relatively small speech database. A stream of PCA-based coefficients is added to our HMM-based synthesizer and makes it possible to generate the voiced excitation during synthesis. An improvement over the traditional excitation is reported, while the synthesis engine footprint remains under about 1Mb.
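The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the pitch-synchronous residual frames have already been extracted and resampled to a common length (the abstract does not specify this step), and it uses random data as a stand-in for real residuals. The function names `fit_eigenresiduals`, `encode`, and `decode` are hypothetical.

```python
import numpy as np

def fit_eigenresiduals(frames, n_components):
    """Compute an orthonormal PCA basis from residual frames.

    frames: array of shape (n_frames, frame_len); assumed to be
    length-normalized pitch-synchronous residual frames.
    Returns the mean frame and the first n_components basis vectors.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are orthonormal principal directions, ordered by
    # decreasing singular value (i.e. decreasing explained variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def encode(frame, mean, basis):
    """Project one frame onto the basis to get its PCA coefficients."""
    return basis @ (frame - mean)

def decode(coeffs, mean, basis):
    """Rebuild an approximate frame from its PCA coefficients."""
    return mean + basis.T @ coeffs

# Synthetic stand-in data (assumption): 200 frames of 64 samples each.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 64))

mean, basis = fit_eigenresiduals(frames, n_components=8)
coeffs = encode(frames[0], mean, basis)   # compact per-frame description
approx = decode(coeffs, mean, basis)      # approximate reconstruction
```

In this sketch, the per-frame coefficient vector plays the role of the PCA-based stream mentioned in the abstract. Only the mean frame and the small basis need to be stored for resynthesis.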
Keywords
audio databases; eigenvalues and eigenfunctions; hidden Markov models; principal component analysis; speech synthesis; HMM-based synthesizer; PCA; eigenresidual; excitation model; parametric speech synthesis; pitch-synchronous residual frame decomposition; speech database; Computational modeling; Databases; Hidden Markov models; Principal component analysis; Speech; Speech synthesis; Synthesizers
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference, 2009 17th European
Conference_Location
Glasgow
Print_ISBN
978-161-7388-76-7
Type
conf
Filename
7077264