Title :
Incorporating localisation cues in a fragment decoding framework for distant binaural speech recognition
Author :
Ma, Ning ; Barker, J. ; Christensen, Helen ; Green, P.
Author_Institution :
Dept. of Comput. Sci., Univ. of Sheffield, Sheffield, UK
Date :
May 30 - June 1, 2011
Abstract :
This paper addresses the problem of speech recognition using distant microphones in reverberant multisource noise conditions. Specifically, the experiments employ recordings of a noisy domestic living room made using a pair of microphones in a binaural configuration, to which target speech has been added after convolution with binaural room impulse responses. Our scheme employs two stages: first, spectro-temporal acoustic source fragments are located using signal-level cues; second, a top-down hypothesis-driven stage simultaneously searches for the most probable allocation of fragments to target or masker and the corresponding acoustic model state sequence. The paper reports a first attempt to use binaural localisation cues within this framework. Our initial experiments with localisation cues have not improved on the performance of the baseline, which uses single-channel source separation cues alone. The paper discusses potential reasons for the lack of improvement and suggests fresh ideas that may prove more successful.
Keywords :
microphones; speech recognition; binaural room impulse response; distant binaural speech recognition; distant microphone; fragment decoding framework; reverberant multisource noise condition; single channel source separation
Conference_Title :
2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA)
Conference_Location :
Edinburgh
Print_ISBN :
978-1-4577-0997-5
DOI :
10.1109/HSCMA.2011.5942400