DocumentCode
2554287
Title
Multi-modal front-end for speaker activity detection in small meetings
Author
Even, Jani ; Heracleous, Panikos ; Ishi, Carlos ; Hagita, Norihiro
Author_Institution
ATR Intelligent Robotics and Communication Laboratories, Kyoto, Japan
fYear
2011
fDate
25-30 Sept. 2011
Firstpage
536
Lastpage
541
Abstract
Small informal meetings of two to four participants are very common in work environments, so a convenient way to record and archive them is of great interest. To archive such meetings efficiently, an important task is keeping track of “who talked when” during a meeting. This paper proposes a new multi-modal approach to this speaker activity detection problem. One novelty of the proposed approach is that it uses a human tracker that relies on scanning laser range finders (LRFs) to localize the participants. This choice is especially relevant for robotic applications, as robots are often equipped with LRFs for navigation purposes. In the proposed system, a tabletop microphone array in the center of the meeting room acquires the audio data while the LRF-based human tracker monitors the movement of the participants. Speaker activity detection is then performed using Gaussian mixture models that were trained beforehand. An experiment reproducing a meeting configuration demonstrates the performance of the system for speaker activity detection. In particular, the proposed hands-free system maintains a good level of performance compared to the use of close-talking microphones while participants are speaking simultaneously.
Keywords
Arrays; Humans; Interference; Joints; Microphones; Noise; Robots
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on
Conference_Location
San Francisco, CA
ISSN
2153-0858
Print_ISBN
978-1-61284-454-1
Type
conf
DOI
10.1109/IROS.2011.6095051
Filename
6095051