DocumentCode
3626866
Title
Multimodal Meeting Monitoring: Improvements on Speaker Tracking and Segmentation through a Modified Mixture Particle Filter
Author
Viktor Rozgic;Carlos Busso;Panayotis G. Georgiou;Shrikanth Narayanan
Author_Institution
Department of Electrical Engineering, Speech Analysis and Interpretation Laboratory, University of Southern California, Viterbi School of Engineering, E-mail: rozgic@usc.edu
fYear
2007
Firstpage
60
Lastpage
65
Abstract
In this paper, we address improvements to our multimodal system for tracking meeting participants and speaker segmentation, with a focus on the microphone array modality. We propose an algorithm that uses directions of arrival estimated for each microphone pair as observations and performs tracking of an unknown number of acoustically active meeting participants and subsequent speaker segmentation. We propose a modified mixture particle filter (mMPF) for tracking acoustic sources in the track-before-detection (TbD) framework. Trajectories of sound sources are reconstructed by the optimal assignment of posterior mixture components produced by the mMPF in consecutive frames. Further, we propose a sequential optimal change-point detection algorithm that discovers speech segments in the reconstructed trajectories, i.e., performs speaker segmentation. The algorithm is tested on a multi-participant meeting dataset both separately and as part of the multimodal system. On the task of speaker detection in the multimodal setup, we report significant improvement over our previous state-of-the-art implementation.
Keywords
"Monitoring","Particle tracking","Particle filters","Microphone arrays","Loudspeakers","Acoustic signal detection","Speech","Phase estimation","Trajectory","Cameras"
Publisher
ieee
Conference_Titel
Multimedia Signal Processing, 2007. MMSP 2007. IEEE 9th Workshop on
Print_ISBN
978-1-4244-1273-0;978-1-4244-1274-7
Type
conf
DOI
10.1109/MMSP.2007.4412818
Filename
4412818