Title :
View Independent Computer Lip-Reading
Author :
Lan, Yuxuan ; Theobald, Barry-John ; Harvey, Richard
Author_Institution :
Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK
Abstract :
Computer lip-reading systems are usually designed to work using a full-frontal view of the face. However, many human experts tend to prefer to lip-read using an angled view. In this paper we consider issues related to the best viewing angle for an automated lip-reading system. In particular, we seek answers to the following questions: (1) Do computers lip-read better using a frontal or a non-frontal view of the face? (2) What is the best viewing angle for a computer lip-reading system? (3) How can a computer lip-reading system be made to work independently of viewing angle? We investigate these issues using a purpose-built audio-visual dataset that contains simultaneous recordings of a speaker reciting continuous speech at five angles. We find that the system performs best on a non-frontal view, perhaps because lip gestures, such as lip-protrusion and lip-rounding, are more pronounced when viewed from an angle. We also describe a simple linear mapping that allows us to map any view of the face to the view that we find to be optimal. Hence we present a view-independent lip-reading system.
Keywords :
audio-visual systems; face recognition; gesture recognition; speech recognition; audio-visual dataset; automated lip-reading system; frontal face view; linear mapping; lip gestures; lip-protrusion; lip-rounding; nonfrontal face view; view-independent computer lip-reading system; viewing angle; visual speech recognition; Accuracy; Cameras; Computers; Hidden Markov models; Shape; Speech; Visualization; computer lip-reading; feature mapping; view-independence
Conference_Title :
Multimedia and Expo (ICME), 2012 IEEE International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4673-1659-0
DOI :
10.1109/ICME.2012.192