DocumentCode :
118025
Title :
Speaker diarization based on audio-visual integration for smart posterboard
Author :
Wakabayashi, Yukoh ; Inoue, Koji ; Yoshimoto, Hiromasa ; Kawahara, Tatsuya
Author_Institution :
Acad. Center for Comput. & Media Studies, Kyoto Univ., Kyoto, Japan
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
1
Lastpage :
4
Abstract :
We present a speaker diarization method based on audio-visual integration. We address poster conversations, which are more challenging than general meetings because participants move freely and audience members speak infrequently. In this setting, it is difficult to detect "who spoke when" using acoustic information alone. We therefore incorporate visual information to improve diarization accuracy. We propose two integration methods: a rule-based method and a stochastic method. Experiments on real poster conversations show that the integration methods significantly outperform the baseline method, which uses acoustic information only.
Keywords :
audio signals; audio-visual systems; speaker recognition; video signals; acoustic information; audio-visual integration approach; baseline method; poster conversation; rule-based integration method; smart posterboard; speaker diarization method; stochastic integration method; visual information; Acoustics; Direction-of-arrival estimation; Microphones; Multiple signal classification; Phasor measurement units; Speech; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
Conference_Location :
Siem Reap
Type :
conf
DOI :
10.1109/APSIPA.2014.7041584
Filename :
7041584