DocumentCode :
2053220
Title :
Channel selection based on multichannel cross-correlation coefficients for distant speech recognition
Author :
Kumatani, Kenichi ; McDonough, John ; Lehman, Jill Fain ; Raj, Bhiksha
Author_Institution :
Disney Res., Pittsburgh, Pittsburgh, PA, USA
fYear :
2011
fDate :
May 30 2011-June 1 2011
Firstpage :
1
Lastpage :
6
Abstract :
In theory, beamforming performance can be improved by using as many microphones as possible, but in practice it has been shown that using all possible channels does not always improve speech recognition performance. In this work, we present a new channel selection method that increases the computational efficiency of beamforming for distant speech recognition (DSR) without sacrificing performance. To achieve better performance, we treat a channel that is uncorrelated with the others as unreliable and choose a subset of microphones whose signals are most highly correlated with each other. We use the multichannel cross-correlation coefficient (MCCC) as a measure for selecting the reliable channels. The selected channels are then used for beamforming. We evaluate our channel selection technique with DSR experiments on real children's speech data captured using a linear array with 64 microphones. A single distant microphone provided a word error rate (WER) of 15.4%, which was reduced to 8.5% by super-directive beamforming with all the sensors. The experimental results suggest that almost the same recognition performance can be obtained with half the number of sensors in the case of super-directive beamforming. Maximum kurtosis beamforming with 48 sensors out of a total of 64 achieved a WER of 5.7%, which is comparable to the 5.2% WER obtained with a close-talking microphone.
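The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no formula or search procedure, so the sketch assumes the definition of the squared MCCC commonly used in the literature (one minus the determinant of the normalized correlation matrix of the channels), assumes the channels have already been time-aligned toward the speaker, and uses a greedy backward elimination as a stand-in search strategy. The names `mccc_squared` and `select_channels` are hypothetical.

```python
import numpy as np


def mccc_squared(signals):
    """Squared MCCC of time-aligned channels (one channel per row).

    Assumed definition: 1 - det(R), where R is the matrix of pairwise
    correlation coefficients between channels.
    """
    r_norm = np.corrcoef(signals)
    return 1.0 - np.linalg.det(r_norm)


def select_channels(signals, num_keep):
    """Greedy backward elimination (an assumption, not the paper's search).

    Repeatedly drops the channel whose removal leaves the remaining
    channels with the highest squared MCCC, until num_keep remain.
    Returns the indices of the channels that were kept.
    """
    if num_keep < 2:
        raise ValueError("MCCC needs at least two channels")
    keep = list(range(signals.shape[0]))
    while len(keep) > num_keep:
        scores = []
        for ch in keep:
            rest = [k for k in keep if k != ch]
            scores.append(mccc_squared(signals[rest]))
        # Dropping this channel gives the most mutually correlated remainder.
        keep.remove(keep[int(np.argmax(scores))])
    return keep


# Toy data: three channels share one source, channel 2 is unrelated noise.
rng = np.random.default_rng(0)
source = rng.standard_normal(4000)
toy = np.stack([
    source + 0.1 * rng.standard_normal(4000),
    source + 0.1 * rng.standard_normal(4000),
    rng.standard_normal(4000),
    source + 0.1 * rng.standard_normal(4000),
])
kept = select_channels(toy, num_keep=3)
```

In this toy case the unrelated channel is the one discarded. With many channels the determinant can become very small, so a practical version would likely work with a log-determinant or a regularized correlation matrix; that refinement is omitted here.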
Keywords :
array signal processing; speech recognition; DSR; MCCC; WER; beamforming computational efficiency; channel selection method; distant speech recognition; linear array; microphones; multichannel cross-correlation coefficients; super-directive beamforming; word error rate; Array signal processing; Delay effects; Microphone arrays; Sensors; Speech; Speech recognition; beamforming; channel selection; microphone arrays; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Hands-free Speech Communication and Microphone Arrays (HSCMA), 2011 Joint Workshop on
Conference_Location :
Edinburgh
Print_ISBN :
978-1-4577-0997-5
Type :
conf
DOI :
10.1109/HSCMA.2011.5942398
Filename :
5942398