• DocumentCode
    730787
  • Title
    A study on joint beamforming and spectral enhancement for robust speech recognition in reverberant environments
  • Author
    Xiong, Feifei ; Meyer, Bernd T. ; Goetze, Stefan
  • Author_Institution
    Project Group Hearing, Speech & Audio Technol. (HSA), Fraunhofer Inst. for Digital Media Technol. IDMT, Oldenburg, Germany
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5043
  • Lastpage
    5047
  • Abstract
    This work evaluates multi-microphone beamforming and single-microphone spectral enhancement strategies to alleviate the reverberation effect for robust automatic speech recognition (ASR) systems in different reverberant environments characterized by different reverberation times T60 and direct-to-reverberation ratios (DRRs). The systems consist of minimum variance distortionless response (MVDR) beamformers in combination with minimum mean square error (MMSE) estimators, and late reverberation spectral variance (LRSV) estimators, the latter employing a generalized model of the room impulse response (RIR). Various system architectures are analyzed with a focus on optimal speech recognition performance. The system combining an MVDR beamformer and a subsequent MMSE estimator was found to lead to the best results, with relative reductions of 27.7% compared to the baseline system. This is attributed to a more accurate LRSV estimate from spatial averaging and diffuse field refinement for the MMSE estimator.
  • Keywords
    array signal processing; least mean squares methods; microphone arrays; reverberation; speech recognition; ASR; DRR; LRSV; MMSE; MVDR; RIR; diffuse field refinement; direct-to-reverberation ratios; joint beamforming; late reverberation spectral variance estimators; minimum mean square error estimators; minimum variance distortionless response beamformers; multimicrophone beamforming; reverberant environments; robust automatic speech recognition systems; room impulse response; single-microphone spectral enhancement; spatial averaging; spectral enhancement; Array signal processing; Coherence; Lead; Process control; Speech; Xenon; Speech dereverberation; late reverberation spectral variance; minimum mean square error estimator; minimum variance distortionless response beamformer; speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178931
  • Filename
    7178931