Title :
Mel-Spectrographic Mask Estimation for Missing Data Speech Recognition using Short-Time-Fourier-Transform Ratio Estimators
Author :
Kuhne, Markus ; Togneri, Roberto ; Nordholm, Sven Erik
Author_Institution :
Sch. of Electr., Electron. & Comput. Eng., Western Australia Univ., Nedlands, WA, Australia
Abstract :
This paper adopts the framework of DUET, a recently proposed blind source separation (BSS) method, for speech recognition. Based on attenuation and delay estimation in stereo signals, spectrographic masks are designed to extract a target speaker from a mixture containing multiple speech sources. Instead of using these masks for resynthesis, we avoid source reconstruction and propose combining the source separation with a missing data speech recognizer. The results of connected digit experiments in a multi-speaker environment demonstrate the validity of the approach.
Keywords :
Fourier transforms; blind source separation; delay estimation; speech recognition; speech synthesis; DUET; data speech recognition; mel-spectrographic mask estimation; multi-speaker environment; short time-Fourier-transform ratio estimators; source reconstruction; speech resynthesis; speech sources; stereo signals spectrographic masks; Attenuation; Australia; Automatic speech recognition; Microphones; Source separation; Speech coding; Speech enhancement; Time frequency analysis; masks; missing data
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.366935