DocumentCode
248069
Title
An audiovisual attention model for natural conversation scenes
Author
Coutrot, Antoine ; Guyader, Nathalie
Author_Institution
Gipsa-Lab., Grenoble-Alpes Univ., Grenoble, France
fYear
2014
fDate
27-30 Oct. 2014
Firstpage
1100
Lastpage
1104
Abstract
Classical visual attention models consider neither social cues, such as faces, nor auditory cues, such as speech. However, faces are known to capture visual attention more than any other visual feature, and recent studies have shown that speech turn-taking affects the gaze of non-involved viewers. In this paper, we propose an audiovisual saliency model that predicts the eye movements of observers viewing other people having a conversation. Using a speaker diarization algorithm, our audiovisual saliency model increases the saliency of the speakers relative to the addressees. We evaluated our model with eye-tracking data and found that it significantly outperforms visual attention models that use an equal and constant saliency value for all faces.
Keywords
audio-visual systems; gaze tracking; image processing; speech processing; audiovisual attention model; audiovisual saliency model; classical visual attention models; eye movement prediction; eye-tracking data; natural conversation scenes; social cues; speaker diarization algorithm; Computational modeling; Feature extraction; Observers; Predictive models; Speech; Videos; Visualization; eye movements; social gaze; speaker diarization
fLanguage
English
Publisher
IEEE
Conference_Titel
Image Processing (ICIP), 2014 IEEE International Conference on
Conference_Location
Paris
Type
conf
DOI
10.1109/ICIP.2014.7025219
Filename
7025219