Speech coding by limited weight neural networks (LWNN)

Author

Gas, Bruno ; Zarader, Jean Luc ; Sellem, Philippe ; Didiot, Jean Charles

Author_Institution

Lab. des Instrum. et Syst., Univ. Pierre et Marie Curie, Paris, France

Volume

5

fYear

1997

fDate

12-15 Oct 1997

Firstpage

4081

Abstract

We present a new kind of speech coding. Speech is usually coded either by linear predictive coding (LPC) or one of its derivatives (LAR, LPCC), or by spectral analysis such as the FFT, the cepstrum, or MFCC. We propose to use a three-layer neural network to learn phonemes extracted from the DARPA-TIMIT database. The network is designed to predict the next input signal value from the N previous ones. During the training stage, the first weight layer is shared by all phonemes, while the second weight layer is different for each phoneme. In the generalization stage, the first weight layer is set to the weights obtained in the training phase and remains fixed. To code a phoneme from the test database, only the output weight layer is trained to predict that phoneme's sample values. The final neural predictive coding (NPC) corresponds to this second weight layer. We show that a normalized coding can easily be obtained by using a nonlinear function of the weights instead of the weights themselves. Results are compared with those of other temporal speech coding methods. A study of NPC by discriminant analysis and an application of an MLP to phoneme recognition are presented.
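
The coding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions made only for illustration: the first layer is a random stand-in for the layer learned during training, the hidden units use tanh, the output layer is fitted by closed-form least squares instead of iterative training, and all names (make_frames, npc_code, N_PREV, N_HIDDEN) as well as the synthetic test segment are hypothetical.

```python
import numpy as np

def make_frames(signal, n_prev):
    """Return (inputs, targets): each input row holds n_prev past samples,
    and the matching target is the sample that follows them."""
    x = np.asarray(signal, dtype=float)
    rows = np.stack([x[i:i + n_prev + 1] for i in range(len(x) - n_prev)])
    return rows[:, :-1], rows[:, -1]

def npc_code(signal, w1, b1):
    """Fit only the output layer for one segment; the first layer stays fixed.
    The fitted output weights are returned as the segment's code."""
    inputs, targets = make_frames(signal, w1.shape[1])
    hidden = np.tanh(inputs @ w1.T + b1)   # fixed first layer (tanh is an assumption)
    # Assumption: least squares replaces the iterative training of the output layer.
    w2, *_ = np.linalg.lstsq(hidden, targets, rcond=None)
    return w2

def predict(signal, w1, b1, w2):
    """Predict each next sample from the n_prev samples before it."""
    inputs, _ = make_frames(signal, w1.shape[1])
    return np.tanh(inputs @ w1.T + b1) @ w2

# Hypothetical sizes; the abstract does not give N or the hidden layer width.
N_PREV, N_HIDDEN = 8, 4
rng = np.random.default_rng(0)
# Random stand-in for the first layer that would be learned in the training stage.
w1 = rng.normal(scale=0.5, size=(N_HIDDEN, N_PREV))
b1 = np.zeros(N_HIDDEN)

# Synthetic segment used in place of a TIMIT phoneme.
t = np.arange(400)
segment = np.sin(0.1 * t) + 0.5 * np.sin(0.23 * t)

code = npc_code(segment, w1, b1)           # one coefficient per hidden unit
_, targets = make_frames(segment, N_PREV)
mse = np.mean((predict(segment, w1, b1, code) - targets) ** 2)
print("code:", code)
print("prediction MSE:", mse)
```

In this sketch the code length equals the number of hidden units, so every segment maps to a vector of the same size regardless of its duration. The normalization by a nonlinear function of the weights mentioned in the abstract is not shown, since the abstract does not specify that function.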

Keywords

database management systems; feedforward neural nets; generalisation (artificial intelligence); learning (artificial intelligence); multilayer perceptrons; natural language interfaces; speech coding; speech recognition; statistical analysis; DARPA-TIMIT database; discriminant analysis; generalization; input signal value prediction; learning; limited weight neural networks; linear predictor; neural predictive coding; nonlinear function; normalized coding; phoneme recognition; phonemes; spectral analysis; speech coding; temporal speech coding; test database phoneme; three layer neural network; training stage; Databases; Ear; Filters; Instruments; Linear predictive coding; Multilayer perceptrons; Neural networks; Signal generators; Signal processing; Speech coding

fLanguage

English

Publisher

IEEE

Conference_Titel

Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on

Conference_Location

Orlando, FL

ISSN

1062-922X

Print_ISBN

0-7803-4053-1

Type

conf

DOI

10.1109/ICSMC.1997.637335

Filename

637335