DocumentCode
454699
Title
Unsupervised Learning of Overlapped Speech Model Parameters For Multichannel Speech Activity Detection in Meetings
Author
Laskowski, Kornel ; Schultz, Tanja
Author_Institution
Carnegie Mellon Univ., Pittsburgh, PA
Volume
1
fYear
2006
fDate
14-19 May 2006
Abstract
The study of meetings, and multi-party conversation in general, is currently the focus of much attention, calling for more robust and more accurate speech activity detection systems. We present a novel multichannel speech activity detection algorithm, which explicitly models the overlap incurred by participants taking turns at speaking. Parameters for overlapped speech states are estimated during decoding by using and combining knowledge from other observed states in the same meeting, in an unsupervised manner. We demonstrate on the NIST Rich Transcription Spring 2004 data set that the new system almost halves the number of frames missed by a competitive algorithm within regions of overlapped speech. The overall speech detection error on unseen data is reduced by 36% relative.
Keywords
Gaussian processes; decoding; speech coding; speech recognition; unsupervised learning; NIST Rich Transcription Spring 2004 data set; meetings; multichannel speech activity detection; overlapped speech model parameters; Acoustic signal detection; Crosstalk; Detection algorithms; Hidden Markov models; Microphones; NIST; Robustness; Speech analysis; Springs
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location
Toulouse
ISSN
1520-6149
Print_ISBN
1-4244-0469-X
Type
conf
DOI
10.1109/ICASSP.2006.1660190
Filename
1660190