  • DocumentCode
    1669110
  • Title
    The gesturer is the speaker
  • Author
    Gebre, Binyam Gebrekidan ; Wittenburg, Peter ; Heskes, Tom
  • Author_Institution
    Max Planck Inst. for Psycholinguistics, Netherlands
  • fYear
    2013
  • Firstpage
    3751
  • Lastpage
    3755
  • Abstract
    We present and solve the speaker diarization problem in a novel way. We hypothesize that the gesturer is the speaker and that identifying the gesturer can be taken as identifying the active speaker. We provide evidence in support of the hypothesis from the gesture literature and audio-visual synchrony studies. We also present a vision-only diarization algorithm that relies on gestures (i.e., upper-body movements). Experiments carried out on 8.9 hours of a publicly available dataset (the AMI meeting data) show that diarization error rates as low as 15% can be achieved.
  • Keywords
    gesture recognition; speaker recognition; AMI meeting data; active speaker; diarization error rates; gesturer; speaker diarization problem; Abstracts; Gesturer diarisation; Speaker diarisation; Speaker segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6638359
  • Filename
    6638359