On the training strategies of neural networks for speech recognition

Author

Gurgen, Fikret S. ; Aikawa, Kiyoaki ; Shikano, Kiyohiro

Author_Institution

Dept. of Comput. Eng., Bosphorus Univ., Istanbul, Turkey

Volume

4

fYear

1992

fDate

7-11 Jun 1992

Firstpage

749

Abstract

The authors investigate how to introduce invariant features into neural networks for speech recognition using conventional back propagation (BP) and K-neighbor interpolation training (KNIT) with a number of time-shifted examples (TSEs) of the same training sample. The TSEs are used to train a multilayer perceptron (MLP) and a time-delay neural network (TDNN), enriching the training sample set so that it covers a larger area of the phoneme sample space. Speaker-dependent phoneme recognition experiments were performed. The advantages and disadvantages of using time-shifted examples of a training sample with an MLP and a TDNN structure, and with the BP and KNIT algorithms, are discussed.
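
As an illustration of the time-shifted examples mentioned in the abstract, the following is a minimal sketch of how several shifted windows might be cut from one labeled phoneme segment. The abstract does not give the window length, the shift range, or the feature type, so the function name `time_shifted_examples` and the parameters `center`, `width`, and `max_shift` are assumptions made for illustration, not the authors' procedure.

```python
def time_shifted_examples(frames, center, width, max_shift):
    """Cut windows of `width` frames around `center`, shifted by
    -max_shift..+max_shift frames (hypothetical parameters).

    `frames` is a list of per-frame feature vectors for one utterance.
    Returns a list of (shift, window) pairs; shifts whose window would
    run past either end of `frames` are skipped.
    """
    half = width // 2
    examples = []
    for shift in range(-max_shift, max_shift + 1):
        start = center + shift - half
        end = start + width
        if start < 0 or end > len(frames):
            continue  # window would fall outside the utterance
        examples.append((shift, frames[start:end]))
    return examples


# Toy utterance: 20 frames with 2 made-up feature values per frame.
frames = [[float(t), float(t) * 0.5] for t in range(20)]

# One labeled phoneme centered at frame 10 yields up to 5 training windows,
# all of which would share the same phoneme label.
tses = time_shifted_examples(frames, center=10, width=6, max_shift=2)
```

Each returned window keeps the label of the original sample, so a single labeled phoneme contributes several training inputs at slightly different time alignments.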

Keywords

backpropagation; delays; feedforward neural nets; speech recognition; K-neighbor interpolation training; conventional back propagation; invariant features; multilayer perceptron; neural networks; phoneme sample space; time-delay neural network; time-shifted examples; training strategies; Acoustical engineering; Acoustics; Feature extraction; Humans; Interpolation; Multi-layer neural network; Multilayer perceptrons; Neural networks; Speech recognition; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Networks, 1992. IJCNN., International Joint Conference on

Conference_Location

Baltimore, MD

Print_ISBN

0-7803-0559-0

Type

conf

DOI

10.1109/IJCNN.1992.227228

Filename

227228