DocumentCode :
1686894
Title :
Coupling binary masking and robust ASR
Author :
Narayanan, Arun ; Wang, DeLiang
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2013
Firstpage :
6817
Lastpage :
6821
Abstract :
We present a novel framework for performing speech separation and robust automatic speech recognition (ASR) in a unified fashion. Separation is performed by estimating the ideal binary mask (IBM), which identifies speech-dominant and noise-dominant units in a time-frequency (T-F) representation of the noisy signal. ASR is performed on cepstral features extracted after binary masking. Previous systems perform these steps sequentially: separation followed by recognition. The proposed framework, which we call bidirectional speech decoding (BSD), unifies these two stages by using multiple IBM estimators, each of which is designed specifically for a back-end acoustic phonetic unit (BPU) of the recognizer. The standard ASR decoder is modified to use these IBM estimators to obtain BPU-specific cepstra during likelihood calculation. On the Aurora-4 robust ASR task, the proposed framework obtains a relative improvement of 17% in word error rate over the noisy baseline. It also obtains significant improvements in the quality of the estimated IBM.
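For readers unfamiliar with the ideal binary mask mentioned in the abstract, the following is a minimal sketch of its commonly used definition: a T-F unit is labelled 1 when its local SNR exceeds a threshold and 0 otherwise. It is not the BSD estimator from the paper. The function name `ideal_binary_mask`, the `lc_db` threshold parameter, and the toy energy values are assumptions made for illustration only.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each T-F unit 1 (speech dominant) or 0 (noise dominant).

    speech_energy, noise_energy: lists of rows of non-negative energies,
    one value per T-F unit, assumed known (hence "ideal").
    lc_db: hypothetical name for the local SNR threshold in dB.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)   # no speech energy: noise dominant
            elif n <= 0.0:
                row.append(1)   # speech with no noise: speech dominant
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask

# Toy 2x3 grid of T-F energies (made-up values).
speech = [[4.0, 1.0, 0.5], [0.1, 9.0, 2.0]]
noise = [[1.0, 2.0, 0.5], [1.0, 1.0, 8.0]]
mask = ideal_binary_mask(speech, noise)
```

In practice the clean speech and noise energies are not available at test time, which is why the mask has to be estimated from the noisy signal.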
Keywords :
estimation theory; speech coding; speech intelligibility; speech recognition; Aurora-4 robust ASR task; BPU-specific cepstra; BSD; automatic speech recognition; back-end acoustic phonetic unit; bidirectional speech decoding; binary masking; cepstral feature; ideal binary mask; multiple IBM estimator; noise dominant unit; speech dominant unit; speech separation; standard ASR decoder; time-frequency representation; word error rate; Decoding; Estimation; Feature extraction; Noise; Noise measurement; Speech; Speech recognition; Aurora-4; Computational Auditory Scene Analysis; bidirectional speech decoder; noise robust ASR;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638982
Filename :
6638982