DocumentCode :
454699
Title :
Unsupervised Learning of Overlapped Speech Model Parameters For Multichannel Speech Activity Detection in Meetings
Author :
Laskowski, Kornel ; Schultz, Tanja
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA
Volume :
1
fYear :
2006
fDate :
14-19 May 2006
Abstract :
The study of meetings, and multi-party conversation in general, is currently the focus of much attention, calling for more robust and more accurate speech activity detection systems. We present a novel multichannel speech activity detection algorithm, which explicitly models the overlap incurred by participants taking turns at speaking. Parameters for overlapped speech states are estimated during decoding by using and combining knowledge from other observed states in the same meeting, in an unsupervised manner. We demonstrate on the NIST Rich Transcription Spring 2004 data set that the new system almost halves the number of frames missed by a competitive algorithm within regions of overlapped speech. The overall speech detection error on unseen data is reduced by 36% relative.
Keywords :
Gaussian processes; decoding; speech coding; speech recognition; unsupervised learning; NIST Rich Transcription Spring 2004 data set; meetings; multichannel speech activity detection; overlapped speech model parameters; Acoustic signal detection; Crosstalk; Detection algorithms; Hidden Markov models; Microphones; NIST; Robustness; Speech analysis; Springs
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
ISSN :
1520-6149
Print_ISBN :
1-4244-0469-X
Type :
conf
DOI :
10.1109/ICASSP.2006.1660190
Filename :
1660190