DocumentCode
2933257
Title
Vocal tract modelling with recurrent neural networks
Author
Burrows, T.L. ; Niranjan, M.
Author_Institution
Dept. of Eng., Cambridge Univ., UK
Volume
5
fYear
1995
fDate
9-12 May 1995
Firstpage
3315
Abstract
The speech production system is modelled using true glottal excitation as the source and a recurrent neural network to represent the vocal tract. The hidden nodes have multiple delays of one and two samples, making the network equivalent to a parallel formant synthesiser in the linear regions of the hidden node sigmoids. An ARX model identification is carried out to initialise the neural network parameters. These parameters are re-estimated in an analysis-by-synthesis framework to minimise the synthesis (output) error. Unlike other analysis-by-synthesis speech production models such as CELP, the source and filter in this approach are decoupled, enabling manipulation of the source time-scale to achieve high quality pitch changes.
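The equivalence claimed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a tanh sigmoid, unit output weights, an 8 kHz sampling rate, three hypothetical formant frequency/bandwidth pairs, and a unit impulse standing in for the glottal excitation. Each hidden node feeds back its own output at delays of one and two samples, so for small signals (where tanh(s) is approximately s) it behaves as a two-pole resonator, and the summed node outputs behave as a parallel formant synthesiser.

```python
import math

def resonator_weights(freq_hz, bandwidth_hz, fs_hz):
    """Feedback weights (a1, a2) placing a pole pair at the given formant."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle from frequency
    return 2.0 * r * math.cos(theta), -r * r

def run_node(x, a1, a2, nonlinear=True):
    """One hidden node: y[n] = f(x[n] + a1*y[n-1] + a2*y[n-2])."""
    y1 = y2 = 0.0
    out = []
    for xn in x:
        s = xn + a1 * y1 + a2 * y2
        yn = math.tanh(s) if nonlinear else s       # tanh is an assumed sigmoid
        out.append(yn)
        y2, y1 = y1, yn
    return out

def parallel_network(x, formants, fs_hz, nonlinear=True):
    """Sum the hidden node outputs (unit output weights, an assumption)."""
    total = [0.0] * len(x)
    for freq_hz, bandwidth_hz in formants:
        a1, a2 = resonator_weights(freq_hz, bandwidth_hz, fs_hz)
        y = run_node(x, a1, a2, nonlinear)
        total = [t + v for t, v in zip(total, y)]
    return total

def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

FS_HZ = 8000.0                                        # assumed sampling rate
FORMANTS = [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]  # hypothetical values

def impulse(amplitude, n=400):
    """Impulse excitation, a stand-in for the true glottal source."""
    return [amplitude] + [0.0] * (n - 1)

# Small excitation: nodes stay in the linear region of the sigmoid.
small = impulse(1e-3)
small_err = max_abs_diff(parallel_network(small, FORMANTS, FS_HZ, True),
                         parallel_network(small, FORMANTS, FS_HZ, False))

# Large excitation: the sigmoids saturate and the equivalence breaks down.
large = impulse(1.0)
large_err = max_abs_diff(parallel_network(large, FORMANTS, FS_HZ, True),
                         parallel_network(large, FORMANTS, FS_HZ, False))

print(f"small-signal mismatch: {small_err:.3e}")
print(f"large-signal mismatch: {large_err:.3e}")
```

In this sketch the small-signal mismatch between the nonlinear network and the linear resonator bank is negligible, while the large-signal mismatch is not, which is the sense in which the equivalence holds only in the linear regions of the sigmoids.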
Keywords
IIR filters; delays; digital filters; error analysis; parameter estimation; recurrent neural nets; speech processing; speech synthesis; ARX model identification; analysis-by-synthesis framework; filter; hidden node sigmoids; linear regions; multiple delays; parallel formant synthesiser; pitch changes; recurrent neural networks; source time-scale; speech production; synthesis error; true glottal excitation; vocal tract modelling; Acoustic distortion; Network synthesis; Neural networks; Nonlinear distortion; Nonlinear filters; Production systems; Recurrent neural networks; Speech analysis; Speech synthesis; Vocoders;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location
Detroit, MI
ISSN
1520-6149
Print_ISBN
0-7803-2431-5
Type
conf
DOI
10.1109/ICASSP.1995.479694
Filename
479694