DocumentCode :
1669110
Title :
The gesturer is the speaker
Author :
Gebre, Binyam Gebrekidan ; Wittenburg, Peter ; Heskes, Tom
Author_Institution :
Max Planck Inst. for Psycholinguistics, Netherlands
fYear :
2013
Firstpage :
3751
Lastpage :
3755
Abstract :
We present and solve the speaker diarization problem in a novel way. We hypothesize that the gesturer is the speaker and that identifying the gesturer can be taken as identifying the active speaker. We provide evidence in support of the hypothesis from gesture literature and audio-visual synchrony studies. We also present a vision-only diarization algorithm that relies on gestures (i.e. upper body movements). Experiments carried out on 8.9 hours of a publicly available dataset (the AMI meeting data) show that diarization error rates as low as 15% can be achieved.
Keywords :
gesture recognition; speaker recognition; AMI meeting data; active speaker; diarization error rates; gesturer; speaker diarization problem; Abstracts; Gesturer diarisation; Speaker diarisation; Speaker segmentation
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638359
Filename :
6638359