Title :
A visual front-end for a continuous pose-invariant lipreading system
Author :
Lucey, Patrick ; Sridharan, Sridha
Author_Institution :
Image & Video Technol. Lab., Queensland Univ. of Technol., Brisbane, QLD
Abstract :
An audio-visual automatic speech recognition (AVASR) system that can recognise what a speaker says regardless of head position (i.e. left profile, front, right profile, etc.) would be most useful, as it would enable this technology to be used in a host of realistic applications such as mobile phone and in-vehicle speech recognition. A major hurdle in achieving this goal is developing a visual front-end that can effectively locate and track a user's face and facial features from a single camera. In this paper, we describe a visual front-end that incorporates a pose estimator in conjunction with a parallel series of pose-specific face and facial feature classifiers based on the boosted cascade of simple classifiers devised by Viola and Jones [6]. The visual front-end is evaluated on the CUAVE database. We also give lipreading results on the CUAVE database, which show that AVASR whilst a speaker is moving their head is indeed achievable.
Keywords :
audio-visual systems; face recognition; speech recognition; audio-visual automatic speech recognition system; continuous pose-invariant lipreading system; facial feature classifier; in-vehicle speech recognition; mobile phone; pose-estimator; single camera; visual front-end; Automatic speech recognition; Cameras; Facial features; Head; Mobile handsets; Mouth; Spatial databases; Speech recognition; Testing; Visual databases
Conference_Title :
Signal Processing and Communication Systems, 2008. ICSPCS 2008. 2nd International Conference on
Conference_Location :
Gold Coast, QLD
Print_ISBN :
978-1-4244-4243-0
Electronic_ISBN :
978-1-4244-4243-0
DOI :
10.1109/ICSPCS.2008.4813664