Title :
Multimodal Meeting Monitoring: Improvements on Speaker Tracking and Segmentation through a Modified Mixture Particle Filter
Author :
Viktor Rozgic;Carlos Busso;Panayotis G. Georgiou;Shrikanth Narayanan
Author_Institution :
Department of Electrical Engineering, Speech Analysis and Interpretation Laboratory, University of Southern California, Viterbi School of Engineering, E-mail: rozgic@usc.edu
Abstract :
In this paper we address improvements to our multimodal system for tracking meeting participants and speaker segmentation, with a focus on the microphone array modality. We propose an algorithm that uses directions of arrival estimated for each microphone pair as observations and performs tracking of an unknown number of acoustically active meeting participants, followed by speaker segmentation. We propose a modified mixture particle filter (mMPF) for tracking acoustic sources in the track-before-detection (TbD) framework. Trajectories of sound sources are reconstructed by the optimal assignment of posterior mixture components produced by the mMPF in consecutive frames. Further, we propose a sequential optimal change-point detection algorithm that discovers speech segments in the reconstructed trajectories, i.e., performs speaker segmentation. The algorithm is tested on a multi-participant meeting dataset both separately and as part of the multimodal system. On the task of speaker detection in the multimodal setup, we report significant improvement over our previous state-of-the-art implementation.
Keywords :
"Monitoring","Particle tracking","Particle filters","Microphone arrays","Loudspeakers","Acoustic signal detection","Speech","Phase estimation","Trajectory","Cameras"
Conference_Titel :
Multimedia Signal Processing, 2007. MMSP 2007. IEEE 9th Workshop on
Print_ISBN :
978-1-4244-1273-0;978-1-4244-1274-7
DOI :
10.1109/MMSP.2007.4412818