DocumentCode :
3485129
Title :
Maximum kurtosis beamforming with a subspace filter for distant speech recognition
Author :
Kumatani, Kenichi ; McDonough, John ; Raj, Bhiksha
Author_Institution :
Disney Res., Pittsburgh, Pittsburgh, PA, USA
fYear :
2011
fDate :
11-15 Dec. 2011
Firstpage :
179
Lastpage :
184
Abstract :
This paper presents a new beamforming method for distant speech recognition (DSR). The dominant mode subspace is considered in order to efficiently estimate the active weight vectors for maximum kurtosis (MK) beamforming with the generalized sidelobe canceler (GSC). We demonstrated in [1], [2], [3] that the beamforming method based on the maximum kurtosis criterion can remove reverberant and noise effects without the signal cancellation encountered in conventional beamforming algorithms. The MK beamforming algorithm, however, requires a relatively large amount of data to estimate the active weight vector reliably because it relies on a numerical optimization algorithm. To achieve efficient estimation, we propose cascading the subspace (eigenspace) filter [4, Section 6.8] with the active weight vector. The subspace filter decomposes the output of the blocking matrix into directional signals and ambient noise components. The ambient noise components are then averaged and subtracted from the beamformer's output, which leads to reliable estimation as well as a significant reduction in computation. We show the effectiveness of our method through a set of distant speech recognition experiments on real microphone array data captured in a real environment. Our new beamforming algorithm provided better recognition performance than conventional beamforming techniques, with a word error rate (WER) of 5.3%, which is comparable to the WER of 4.2% obtained with a close-talking microphone. Moreover, it achieved better recognition performance with less adaptation data than the conventional MK beamformer.
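As a rough illustration of the quantity that a maximum kurtosis criterion works with, the sketch below computes the empirical excess kurtosis of two synthetic signals. It is not the authors' algorithm. It assumes, as background not stated in the abstract, that clean speech behaves like a heavy-tailed (super-Gaussian) source and that a mixture of many independent components, such as reverberation plus noise, is closer to Gaussian. The Laplace-distributed stand-in for speech, the function names, and the choice of eight mixed components are all hypothetical.

```python
import random


def excess_kurtosis(samples):
    """Empirical excess kurtosis: E[(x - m)^4] / (E[(x - m)^2])^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    fourth_moment = sum(c ** 4 for c in centered) / n
    return fourth_moment / (variance * variance) - 3.0


def laplace_sample(rng):
    """Draw one Laplace sample as the difference of two exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
num_samples = 20000

# Hypothetical stand-in for a clean, super-Gaussian source.
clean = [laplace_sample(rng) for _ in range(num_samples)]

# Hypothetical stand-in for a degraded signal: the sum of eight
# independent components, which is closer to Gaussian.
mixed = [sum(laplace_sample(rng) for _ in range(8))
         for _ in range(num_samples)]

k_clean = excess_kurtosis(clean)
k_mixed = excess_kurtosis(mixed)
print("excess kurtosis, clean stand-in:", k_clean)
print("excess kurtosis, mixed stand-in:", k_mixed)
```

Under these assumptions the single source shows a clearly higher excess kurtosis than the mixture, which is the kind of contrast a kurtosis-maximizing weight vector would exploit. The GSC structure, the blocking matrix, and the subspace filter described in the abstract are not modeled here.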
Keywords :
array signal processing; eigenvalues and eigenfunctions; error statistics; filtering theory; microphone arrays; optimisation; speech recognition; MK beamforming; WER; active weight vector; adaptation data; ambient noise component; blocking matrix; close-talking microphone; directional signal; distant speech recognition; dominant mode subspace; eigenspace filter; generalized sidelobe canceler; maximum kurtosis beamforming; maximum kurtosis criterion; microphone array; noise effect; numerical optimization algorithm; recognition performance; reverberant effect; subspace filter; word error rate; Array signal processing; Eigenvalues and eigenfunctions; Microphones; Noise; Speech; Speech recognition; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
Type :
conf
DOI :
10.1109/ASRU.2011.6163927
Filename :
6163927