Title :
Speaker diarization based on audio-visual integration for smart posterboard
Author :
Wakabayashi, Yukoh ; Inoue, Koji ; Yoshimoto, Hiromasa ; Kawahara, Tatsuya
Author_Institution :
Acad. Center for Comput. & Media Studies, Kyoto Univ., Kyoto, Japan
Abstract :
We present a speaker diarization method based on an audio-visual integration approach. We deal with poster conversations, which are more challenging than general meetings because participants move freely and audience members speak infrequently. In this setting, it is difficult to detect "who spoke when" using acoustic information alone. We therefore incorporate visual information to improve diarization accuracy. We propose two integration methods: a rule-based method and a stochastic method. Experiments on real poster conversations show that both integration methods significantly outperform the baseline method, which uses acoustic information only.
Keywords :
audio signals; audio-visual systems; speaker recognition; video signals; acoustic information; audio-visual integration approach; baseline method; poster conversation; rule-based integration method; smart posterboard; speaker diarization method; stochastic integration method; visual information; Acoustics; Direction-of-arrival estimation; Microphones; Multiple signal classification; Phasor measurement units; Speech; Visualization;
Conference_Title :
Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
Conference_Location :
Siem Reap
DOI :
10.1109/APSIPA.2014.7041584