DocumentCode
590801
Title
Microphone array processing for distant speech recognition: Towards real-world deployment
Author
Kumatani, Kenichi ; Arakawa, Takeshi ; Yamamoto, Koji ; McDonough, John ; Raj, Bhiksha ; Singh, Rajdeep ; Tashev, I.
Author_Institution
Disney Res., Pittsburgh, PA, USA
fYear
2012
fDate
3-6 Dec. 2012
Firstpage
1
Lastpage
10
Abstract
Distant speech recognition (DSR) holds out the promise of providing a natural human-computer interface in that it enables verbal interactions with computers without the necessity of donning intrusive body- or head-mounted devices. Recognizing distant speech robustly, however, remains a challenge. This paper provides an overview of DSR systems based on microphone arrays. In particular, we present recent work on acoustic beamforming for DSR, along with experimental results verifying the effectiveness of the various algorithms described here; beginning from a word error rate (WER) of 14.3% with a single microphone of a 64-channel linear array, our state-of-the-art DSR system achieved a WER of 5.3%, which was comparable to that of 4.2% obtained with a lapel microphone. Furthermore, we report the results of speech recognition experiments on data captured with a popular device, the Kinect [1]. Even for speakers at a distance of four meters from the Kinect, our DSR system achieved acceptable recognition performance on a large vocabulary task, a WER of 24.1%, beginning from a WER of 42.5% with a single array channel.
Keywords
error statistics; human computer interaction; microphone arrays; speech recognition; 64-channel linear array; DSR systems; Kinect; WER; acoustic beamforming; distant speech recognition; microphone array processing; natural human computer interface; real-world deployment; single array channel; verbal interactions; word error rate; Array signal processing; Arrays; Microphones; Noise; Sensors; Speech recognition; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location
Hollywood, CA
Print_ISBN
978-1-4673-4863-8
Type
conf
Filename
6411948