Visual speech recognition by recurrent neural networks

Author

Rabi, Gihad ; Lu, Siwei

Author_Institution

Dept. of Comput. Sci., Memorial Univ. of Newfoundland, St. John's, Nfld., Canada

Volume

1

fYear

1997

fDate

25-28 May 1997

Firstpage

55

Abstract

A system for visual speech recognition is described in this paper. In the first phase of the system's operation, time-varying visual speech patterns are obtained from a sequence of images. In the second phase, the system uses recurrent neural networks to classify the spatio-temporal pattern as one of the previously-trained words. By specifying a certain behavior when a recurrent network is presented with exemplar sequences, the network is trained with no more than feed-forward complexity. The network's desired behavior is based on characterizing a given word by well-defined segments. Adaptive segmentation is employed to segment the training sequences of a given word.
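The abstract does not say how training stays at feed-forward complexity. One possible reading, sketched below, is that once every frame of a training sequence has a segment label, the desired network state at each step is known in advance, so the previous state can be supplied as an ordinary input and each time step becomes a plain input/target pair. This is an illustrative assumption, not the authors' published procedure. The function names are hypothetical, and the uniform split merely stands in for the adaptive segmentation the paper mentions.

```python
def uniform_segments(num_frames, num_segments):
    """Label each frame with a segment index in 0..num_segments-1.

    Assumption: a uniform split stands in for the paper's adaptive segmentation.
    """
    return [min(t * num_segments // num_frames, num_segments - 1)
            for t in range(num_frames)]


def one_hot(index, size):
    """Return a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def teacher_forced_pairs(frames, num_segments):
    """Turn one labelled sequence into per-step (input, target) pairs.

    The input at step t is the frame's feature vector concatenated with the
    desired state at step t-1; the target is the desired state at step t.
    Because the previous state comes from the labels rather than from the
    network's own output, each pair can be fitted like a feed-forward example.
    """
    labels = uniform_segments(len(frames), num_segments)
    pairs = []
    prev_state = [0.0] * num_segments  # no segment is active before the first frame
    for frame, label in zip(frames, labels):
        target = one_hot(label, num_segments)
        pairs.append((list(frame) + prev_state, target))
        prev_state = target
    return pairs


# Hypothetical 2-D visual features for a four-frame utterance.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
pairs = teacher_forced_pairs(frames, num_segments=2)
```

At recognition time the desired states are not available, so the network's own previous output would have to take the place of the label-derived state; that step is not shown here.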

Keywords

feature extraction; image recognition; image segmentation; image sequences; recurrent neural nets; speech recognition; adaptive segmentation; feed-forward complexity; image sequence; recurrent neural networks; spatio-temporal pattern; time-varying visual speech patterns; training sequences; visual speech recognition; Automatic speech recognition; Computer science; Crosstalk; Feedforward systems; Image segmentation; Mouth; Recurrent neural networks; Shape; Speech analysis; Speech recognition

fLanguage

English

Publisher

ieee

Conference_Titel

Electrical and Computer Engineering, 1997. Engineering Innovation: Voyage of Discovery. IEEE 1997 Canadian Conference on

Conference_Location

St. John's, Nfld.

ISSN

0840-7789

Print_ISBN

0-7803-3716-6

Type

conf

DOI

10.1109/CCECE.1997.614788

Filename

614788