Title :
Scale based features for audiovisual speech recognition
Author :
Matthews, I.A. ; Bangham, J.A. ; Cox, S.J.
Author_Institution :
Sch. of Inf. Syst., East Anglia Univ., Norwich, UK
Abstract :
This paper demonstrates the use of a nonlinear image decomposition, in the form of a sieve, for audiovisual speech recognition on a database of the letters A-Z for ten talkers. A scale-based feature vector is formed for each frame directly from the grayscale pixels of an image containing the talker's mouth. The feature vector is independent of image amplitude and position information, and neither accurate tracking nor special markers are required. Results are presented for audio-only and visual-only recognition and for early and late integrated audiovisual cases.
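As a rough illustration of how a scale-based feature can ignore amplitude and position, the sketch below counts the granules removed at each scale when every image column is passed through a cascade of 1D flat openings. This is only one possible reading of the abstract: the choice of an opening-based sieve, the column-wise processing, and the names opening_1d and scale_histogram are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def opening_1d(x, s):
    """Flat opening of a 1D signal x by a segment of length s (1 <= s <= len(x))."""
    n = len(x)
    # Erosion: minimum over every length-s window lying fully inside the signal.
    eroded = np.array([x[i:i + s].min() for i in range(n - s + 1)])
    # Dilation: each sample takes the largest eroded value among windows covering it.
    opened = np.full(n, -np.inf)
    for i, v in enumerate(eroded):
        opened[i:i + s] = np.maximum(opened[i:i + s], v)
    return opened


def scale_histogram(image, max_scale):
    """Count, per scale, the granules removed from the columns of a grayscale image.

    Entry k of the result counts granules of width k + 1.
    Assumes max_scale is smaller than the number of image rows.
    """
    image = np.asarray(image, dtype=float)
    hist = np.zeros(max_scale, dtype=int)
    for column in image.T:
        previous = column  # an opening by a length-1 segment is the identity
        for s in range(2, max_scale + 2):
            current = opening_1d(column, s)
            # Samples lowered at this stage belong to granules of width s - 1.
            removed = (previous - current) > 0
            # Count runs of consecutive removed samples; each run is one granule.
            starts = removed & ~np.concatenate(([False], removed[:-1]))
            hist[s - 2] += int(starts.sum())
            previous = current
    return hist
```

Because only the number of granules at each scale is kept, rescaling or offsetting the pixel values, or shifting a feature along a column, leaves this histogram unchanged in simple cases. For example, the single column 0, 2, 1, 0 gives one granule of width 1, one of width 2 and none of width 3.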
Keywords :
audio-visual systems; audiovisual speech recognition; database; feature vector; grayscale pixels; image amplitude; nonlinear image decomposition; scale based features; sieve; tracking
Conference_Title :
Integrated Audio-Visual Processing for Recognition, Synthesis and Communication (Digest No: 1996/213), IEE Colloquium on
Conference_Location :
London
DOI :
10.1049/ic:19961152