• DocumentCode
    590801
  • Title
    Microphone array processing for distant speech recognition: Towards real-world deployment
  • Author
    Kumatani, Kenichi ; Arakawa, Takeshi ; Yamamoto, Koji ; McDonough, John ; Raj, Bhiksha ; Singh, Rajdeep ; Tashev, I.
  • Author_Institution
    Disney Res., Pittsburgh, PA, USA
  • fYear
    2012
  • fDate
    3-6 Dec. 2012
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Distant speech recognition (DSR) holds out the promise of providing a natural human-computer interface in that it enables verbal interactions with computers without the necessity of donning intrusive body- or head-mounted devices. Recognizing distant speech robustly, however, remains a challenge. This paper provides an overview of DSR systems based on microphone arrays. In particular, we present recent work on acoustic beamforming for DSR, along with experimental results verifying the effectiveness of the various algorithms described here; beginning from a word error rate (WER) of 14.3% with a single microphone of a 64-channel linear array, our state-of-the-art DSR system achieved a WER of 5.3%, which was comparable to that of 4.2% obtained with a lapel microphone. Furthermore, we report the results of speech recognition experiments on data captured with a popular device, the Kinect [1]. Even for speakers at a distance of four meters from the Kinect, our DSR system achieved acceptable recognition performance on a large vocabulary task, a WER of 24.1%, beginning from a WER of 42.5% with a single array channel.
  • Keywords
    error statistics; human computer interaction; microphone arrays; speech recognition; 64-channel linear array; DSR systems; Kinect; WER; acoustic beamforming; distant speech recognition; microphone array processing; natural human computer interface; real-world deployment; single array channel; verbal interactions; word error rate; Array signal processing; Arrays; Microphones; Noise; Sensors; Speech recognition; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
  • Conference_Location
    Hollywood, CA
  • Print_ISBN
    978-1-4673-4863-8
  • Type
    conf
  • Filename
    6411948