Title :
Robust digit recognition using phase-dependent time-frequency masking
Author :
Shi, Guangji ; Aarabi, Parham
Author_Institution :
Dept. of Electr. & Comput. Eng., Toronto Univ., Ont., Canada
Abstract :
A technique is proposed that uses the time-frequency phase information from two microphones, together with the time-delay-of-arrival (TDOA) of the signal of interest, to estimate an ideal time-frequency mask. At a signal-to-noise ratio (SNR) of 0 dB, the proposed two-microphone technique achieves a digit recognition rate of 71% (averaged over 5 speakers, each speaking 20-30 digits). In contrast, delay-and-sum beamforming achieves only 40% with two microphones and 60% with four, and superdirective beamforming achieves 44% with two microphones and 65% with four.
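The abstract does not give the form of the mask, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a binary mask that keeps a time-frequency cell when the inter-microphone phase difference agrees with the phase predicted by the target's TDOA. The function name `phase_mask`, the threshold `max_phase_error`, and the sign convention for the delay are all hypothetical choices made for illustration.

```python
import numpy as np

def phase_mask(X1, X2, freqs_hz, tdoa_s, max_phase_error=0.5):
    """Binary time-frequency mask from inter-microphone phase (illustrative only).

    X1, X2          : complex STFTs of the two microphones, shape (n_freqs, n_frames)
    freqs_hz        : centre frequency of each STFT row, shape (n_freqs,)
    tdoa_s          : assumed delay (seconds) of the target at microphone 2
                      relative to microphone 1
    max_phase_error : hypothetical threshold in radians, not taken from the paper
    """
    # If x2(t) = x1(t - tau), then X2 = X1 * exp(-j*2*pi*f*tau), so the
    # cross-spectrum X1 * conj(X2) has phase +2*pi*f*tau for the target.
    expected = 2.0 * np.pi * freqs_hz[:, None] * tdoa_s
    cross = X1 * np.conj(X2)
    # Remove the expected phase; np.angle wraps the remainder into (-pi, pi].
    error = np.angle(cross * np.exp(-1j * expected))
    # Keep cells whose phase error is small, zero out the rest.
    return (np.abs(error) <= max_phase_error).astype(float)

# Synthetic check: a source arriving with the assumed TDOA passes the mask,
# while a source with a different delay is rejected.
rng = np.random.default_rng(0)
freqs = np.array([500.0, 1000.0, 2000.0])
X1 = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
tau = 1e-4
X2_target = X1 * np.exp(-2j * np.pi * freqs[:, None] * tau)
X2_interf = X1 * np.exp(-2j * np.pi * freqs[:, None] * (tau + 2.5e-4))
mask_target = phase_mask(X1, X2_target, freqs, tau)
mask_interf = phase_mask(X1, X2_interf, freqs, tau)
```

In such a scheme the mask would be multiplied element-wise with one microphone's STFT before inverting back to the time domain and passing the result to the recognizer. A soft mask that decays smoothly with phase error is an equally plausible design.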
Keywords :
microphones; speech intelligibility; speech recognition; time-frequency analysis; delay-and-sum beamforming; ideal time-frequency mask; phase-dependent time-frequency masking; robust digit recognition; signal-to-noise ratio; superdirective beamforming; time-delay-of-arrival; time-frequency phase information; Array signal processing; Delay; Frequency domain analysis; Gaussian noise; Independent component analysis; Robustness; Speech enhancement
Conference_Title :
Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
Print_ISBN :
0-7803-7965-9
DOI :
10.1109/ICME.2003.1221390