• DocumentCode
    2933257
  • Title
    Vocal tract modelling with recurrent neural networks
  • Author
    Burrows, T.L.; Niranjan, M.
  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    5
  • fYear
    1995
  • fDate
    9-12 May 1995
  • Firstpage
    3315
  • Abstract
    The speech production system is modelled using true glottal excitation as the source and a recurrent neural network to represent the vocal tract. The hidden nodes have multiple delays of one and two samples, making the network equivalent to a parallel formant synthesiser in the linear regions of the hidden node sigmoids. An ARX model identification is carried out to initialise the neural network parameters. These parameters are re-estimated in an analysis-by-synthesis framework to minimise the synthesis (output) error. Unlike other analysis-by-synthesis speech production models such as CELP, the source and filter in this approach are decoupled, enabling manipulation of the source time-scale to achieve high-quality pitch changes.
  • Keywords
    IIR filters; delays; digital filters; error analysis; parameter estimation; recurrent neural nets; speech processing; speech synthesis; ARX model identification; analysis-by-synthesis framework; filter; hidden node sigmoids; linear regions; multiple delays; parallel formant synthesiser; pitch changes; recurrent neural networks; source time-scale; speech production; synthesis error; true glottal excitation; vocal tract modelling; Acoustic distortion; Network synthesis; Neural networks; Nonlinear distortion; Nonlinear filters; Production systems; Recurrent neural networks; Speech analysis; Speech synthesis; Vocoders;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
  • Conference_Location
    Detroit, MI
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-2431-5
  • Type
    conf
  • DOI
    10.1109/ICASSP.1995.479694
  • Filename
    479694
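
The abstract's claim that hidden nodes with feedback delays of one and two samples behave like a parallel formant synthesiser in the linear regions of the sigmoids can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of tanh as the sigmoid, the unit input and output weights, and the formant frequencies and bandwidths used to set the feedback weights are all assumptions made for illustration.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, fs_hz):
    """Feedback weights (a1, a2) of a two-pole resonator
    y[n] = x[n] + a1*y[n-1] + a2*y[n-2] (hypothetical helper)."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle
    return 2.0 * r * math.cos(theta), -r * r


def run_network(excitation, formants, fs_hz, nonlinear=True, out_weights=None):
    """Parallel bank of hidden nodes, each fed back with delays of one
    and two samples; the output is a weighted sum of the hidden nodes.

    With nonlinear=False the sigmoid is replaced by the identity, which
    gives a plain parallel bank of two-pole (formant) resonators.
    """
    coeffs = [resonator_coeffs(f, b, fs_hz) for f, b in formants]
    weights = out_weights or [1.0] * len(coeffs)    # assumed unit weights
    act = math.tanh if nonlinear else (lambda v: v)  # tanh is an assumption
    h1 = [0.0] * len(coeffs)  # hidden states delayed by one sample
    h2 = [0.0] * len(coeffs)  # hidden states delayed by two samples
    output = []
    for x in excitation:
        h = [act(x + a1 * p1 + a2 * p2)
             for (a1, a2), p1, p2 in zip(coeffs, h1, h2)]
        output.append(sum(w * v for w, v in zip(weights, h)))
        h2, h1 = h1, h  # shift the delay lines
    return output


# Illustrative formant (frequency, bandwidth) pairs in Hz; assumed values.
FORMANTS = [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]
FS_HZ = 8000.0
```

Driving both variants with a small-amplitude excitation keeps every hidden node near the origin of the tanh, so the nonlinear and linear outputs should agree closely; a large excitation pushes the nodes into saturation and the two outputs diverge.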