Regional Information Center for Science and Technology - Multi-channel speech processing architectures for noise robust speech recognition: 3rd CHiME challenge results

DocumentCode :

3744880

Title :

Multi-channel speech processing architectures for noise robust speech recognition: 3rd CHiME challenge results

Author :

Lukas Pfeifenberger;Tobias Schrank;Matthias Zöhrer;Martin Hagmüller;Franz Pernkopf

Author_Institution :

Signal Processing and Speech Communication Laboratory, Graz University of Technology, Graz, Austria

Year :

2015

Firstpage :

452

Lastpage :

459

Abstract :

Recognizing speech under noisy conditions is an ill-posed problem. The CHiME 3 challenge targets robust speech recognition in realistic environments such as street, bus, café and pedestrian areas. We study variants of beamformers used for pre-processing multi-channel speech recordings. In particular, we investigate three variants of generalized side-lobe canceller (GSC) beamformers, i.e. GSC with sparse blocking matrix (BM), GSC with adaptive BM (ABM), and GSC with minimum variance distortionless response (MVDR) and ABM. Furthermore, we apply several post-filters to further enhance the speech signal. We introduce MaxPower postfilters and deep neural postfilters (DPFs). DPFs significantly outperformed our baseline systems in terms of the overall perceptual score (OPS) and the perceptual evaluation of speech quality (PESQ). In particular, DPFs achieved an average relative improvement of 17.54% in OPS and 18.28% in PESQ compared to the CHiME 3 baseline. DPFs also achieved the best word error rate (WER) when combined with an ASR engine on simulated development and evaluation data, i.e. 8.98% and 10.82%, respectively. The proposed MaxPower beamformer achieved the best overall WER on CHiME 3 real development and evaluation data, i.e. 14.23% and 22.12%, respectively.
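
For readers unfamiliar with the WER figures quoted in the abstract, the following is a minimal sketch of the standard word error rate definition: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; this is not the scoring tool used in the CHiME 3 challenge, and it applies no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper for illustration; splits on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of six reference words is missing from the hypothesis.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {100 * wer:.2f}%")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.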

Keywords :

"Speech","Speech recognition","Microphones","Artificial neural networks","Speech enhancement","Array signal processing"

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on

Type :

conf

DOI :

10.1109/ASRU.2015.7404830

Filename :

7404830

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3744880