• DocumentCode
    2409796
  • Title

    Speech coding by limited weight neural networks (LWNN)

  • Author

    Gas, Bruno ; Zarader, Jean Luc ; Sellem, Philippe ; Didiot, Jean Charles

  • Author_Institution
    Lab. des Instrum. et Syst., Univ. Pierre et Marie Curie, Paris, France
  • Volume
    5
  • fYear
    1997
  • fDate
    12-15 Oct 1997
  • Firstpage
    4081
  • Abstract
    We present a new kind of speech coding. Speech is usually coded by linear prediction (LPC, or derived representations such as LAR and LPCC) or by spectral analysis, such as the FFT, cepstrum, or MFCC. We propose to use a three-layer neural network to learn phonemes extracted from the DARPA-TIMIT database. The network is designed to predict the next input signal value from the N previous ones. During the training stage, the first weight layer is shared by all phonemes, while the second weight layer is different for each phoneme. In the generalization stage, the first weight layer is held fixed at the values obtained in the training phase. When coding a phoneme from the test database, only the output weight layer is trained, to predict that phoneme's sample values. The final neural predictive coding (NPC) corresponds to this second weight layer. We show that a normalized coding can easily be obtained by using a nonlinear function of the weights instead of the weights themselves. Results are compared with those of other temporal speech coding methods. A study of NPC by discriminant analysis and an application of an MLP to phoneme recognition are presented.
  • Keywords
    database management systems; feedforward neural nets; generalisation (artificial intelligence); learning (artificial intelligence); multilayer perceptrons; natural language interfaces; speech coding; speech recognition; statistical analysis; DARPA-TIMIT database; discriminant analysis; generalization; input signal value prediction; learning; limited weight neural networks; linear predictor; neural predictive coding; nonlinear function; normalized coding; phoneme recognition; phonemes; spectral analysis; speech coding; temporal speech coding; test database phoneme; three layer neural network; training stage; Databases; Ear; Filters; Instruments; Linear predictive coding; Multilayer perceptrons; Neural networks; Signal generators; Signal processing; Speech coding;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1062-922X
  • Print_ISBN
    0-7803-4053-1
  • Type

    conf

  • DOI
    10.1109/ICSMC.1997.637335
  • Filename
    637335
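
The coding scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the shared first layer is replaced here by a fixed random matrix (in the paper it is learned over all training phonemes), the hidden units use tanh with a linear output, and the per-phoneme second layer is fitted by least squares rather than by iterative training. All function names (`frames`, `hidden`, `npc_code`, `prediction_mse`) are hypothetical.

```python
import numpy as np


def frames(signal, n_prev):
    """Return (X, y): windows of n_prev samples and the sample following each."""
    signal = np.asarray(signal, dtype=float)
    X = np.stack([signal[i:i + n_prev] for i in range(len(signal) - n_prev)])
    y = signal[n_prev:]
    return X, y


def hidden(X, W1, b1):
    """Hidden activations of the shared first layer (tanh is an assumption)."""
    return np.tanh(X @ W1 + b1)


def npc_code(signal, W1, b1):
    """Fit only the output layer for one phoneme; its weights act as the code.

    Least squares stands in for the training procedure, which the abstract
    does not specify.
    """
    X, y = frames(signal, W1.shape[0])
    w2, *_ = np.linalg.lstsq(hidden(X, W1, b1), y, rcond=None)
    return w2


def prediction_mse(signal, W1, b1, w2):
    """Mean squared error of next-sample prediction with output weights w2."""
    X, y = frames(signal, W1.shape[0])
    return float(np.mean((hidden(X, W1, b1) @ w2 - y) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_prev, n_hidden = 8, 4                      # illustrative sizes only
    W1 = rng.normal(scale=0.5, size=(n_prev, n_hidden))  # stand-in for the trained shared layer
    b1 = np.zeros(n_hidden)
    phoneme = np.sin(2 * np.pi * 0.05 * np.arange(200))  # synthetic stand-in for a phoneme segment
    code = npc_code(phoneme, W1, b1)
    print("code vector length:", code.shape[0])
    print("prediction MSE:", prediction_mse(phoneme, W1, b1, code))
```

In this sketch each phoneme segment is reduced to a vector with one entry per hidden unit, independent of the segment's length, which is the sense in which the second weight layer serves as a code.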