DocumentCode
2066884
Title
Using a nonlinear model to synthesise natural-sounding vowels
Author
Mann, Iain ; McLaughlin, Steve
Author_Institution
Dept. of Electron. & Electr. Eng., Edinburgh Univ., UK
fYear
2000
fDate
2000
Firstpage
12/1
Lastpage
12/6
Abstract
This paper describes a nonlinear model that is able to generate vowel sounds of any required duration that also contain jitter and shimmer, and hence are more natural-sounding than the equivalent sounds generated by linear prediction techniques. The model is based on a radial basis function (RBF) neural network with a global feedback loop. The network is trained by first placing the radial basis centres onto either a subset of the input data or a fixed hyper-lattice structure. The network weights are then found so as to minimise the mean square error between the input data (which will be a stationary vowel sound segment) and the network output. Regularisation is used when calculating the weight values, as this ensures stability when the global feedback loop is connected for synthesis.
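The training and synthesis procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the delay-embedding dimension, Gaussian basis width, number of centres, ridge (Tikhonov) regulariser and all function names (embed, rbf_design, train, synthesise) are assumptions chosen for illustration, and the centres are placed on a random subset of the input data rather than on a hyper-lattice.

```python
import numpy as np

def embed(x, dim):
    """Delay-embed a 1-D signal: row t is x[t..t+dim-1], target is x[t+dim]."""
    X = np.stack([x[i:len(x) - dim + i] for i in range(dim)], axis=1)
    y = x[dim:]
    return X, y

def rbf_design(X, centres, width):
    """Gaussian basis activations for each row of X (assumed kernel choice)."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def train(x, dim=8, n_centres=40, width=0.5, lam=1e-3, seed=0):
    """Place centres on a random subset of the data, then solve for the
    weights by regularised least squares (ridge, an assumed regulariser)."""
    X, y = embed(x, dim)
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_centres, replace=False)]
    Phi = rbf_design(X, centres, width)
    # Minimise ||Phi w - y||^2 + lam * ||w||^2
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_centres), Phi.T @ y)
    return centres, w

def synthesise(seed_frame, centres, w, width, n_samples):
    """Free-run the network: each output sample is fed back as the newest input."""
    state = np.array(seed_frame, dtype=float)
    out = np.empty(n_samples)
    for n in range(n_samples):
        nxt = rbf_design(state[None, :], centres, width)[0] @ w
        out[n] = nxt
        state = np.append(state[1:], nxt)  # global feedback loop
    return out

# Example: a sine wave stands in for a stationary vowel segment.
x = np.sin(2 * np.pi * np.arange(400) / 50.0)
centres, w = train(x)
y_hat = synthesise(x[:8], centres, w, width=0.5, n_samples=1000)
```

Because synthesis runs in closed loop, the output length is independent of the training segment length. Whether the free-running output stays on a stable oscillation depends on the regularisation strength and the other assumed parameters, so those would need tuning on real speech data.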
Keywords
speech synthesis; MSE minimisation; RBF neural network; global feedback loop; hyper-lattice structure; input data subset; jitter; linear prediction techniques; mean square error; natural-sounding vowels synthesis; network output; network training; network weights; nonlinear model; radial basis centres; radial basis function neural network; regularisation; shimmer; spectral characteristics; stability; stationary vowel sound segment; temporal characteristics; vowel sound duration; vowel sounds generation
fLanguage
English
Publisher
IET
Conference_Titel
State of the Art in Speech Synthesis (Ref. No. 2000/058), IEE Seminar on
Conference_Location
London
Type
conf
DOI
10.1049/ic:20000329
Filename
846969