DocumentCode
1806337
Title
Audio-visual face detection for tracking in a meeting room environment
Author
Barnard, Mark ; Wenwu Wang ; Kittler, Josef ; Naqvi, Syed Mohsen ; Chambers, Jonathon
Author_Institution
Centre for Vision, Speech & Signal Process. (CVSSP), Univ. of Surrey, Guildford, UK
fYear
2013
fDate
9-12 July 2013
Firstpage
1222
Lastpage
1227
Abstract
A key task in many applications such as tracking or face recognition is the detection and localisation of a subject's face in an image. This can still prove to be a challenging task, particularly in low-resolution or noisy images. Here we propose a robust method for face detection using both audio and visual information. We construct a dictionary-learning-based face detector using a set of distinctive and robust image features. We then train a support vector machine classifier using sparse image representations produced by this dictionary to classify face versus background. This is combined with the azimuth angle of the speaker produced by an audio localisation system to constrain the search space for the subject's face. This increases the efficiency of the detection and localisation process by limiting the search area. More importantly, the audio information allows us to know a priori the number of subjects in the image. This greatly reduces the possibility of false-positive face detections. We demonstrate the advantage of this proposed approach over traditional face detection methods on the challenging AV16.3 dataset.
Keywords
face recognition; feature extraction; image classification; learning (artificial intelligence); object detection; object tracking; speaker recognition; support vector machines; audio information; audio localisation system; audio-visual face detection; detection process; dictionary learning based face detector; face recognition; false positive face detections; image features; localisation process; low resolution image; noisy image; object tracking; sparse image representations; speaker azimuth angle; support vector machine classifier; visual information; Dictionaries; Face; Face detection; Feature extraction; Histograms; Vectors; Visualization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Fusion (FUSION), 2013 16th International Conference on
Conference_Location
Istanbul
Print_ISBN
978-605-86311-1-3
Type
conf
Filename
6641136