DocumentCode :
2701958
Title :
Mel-Spectrographic Mask Estimation for Missing Data Speech Recognition using Short-Time-Fourier-Transform Ratio Estimators
Author :
Kuhne, Markus ; Togneri, Roberto ; Nordholm, Sven Erik
Author_Institution :
Sch. of Electr., Electron. & Comput. Eng., Western Australia Univ., Nedlands, WA, Australia
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
This paper adopts the framework of DUET, a recently proposed blind source separation (BSS) method, for speech recognition. Based on attenuation and delay estimation in stereo signals, spectrographic masks are designed to extract a target speaker from a mixture containing multiple speech sources. Instead of using these masks for resynthesis, we avoid source reconstruction and propose combining the source separation with a missing data speech recognizer. Results from connected digit experiments in a multi-speaker environment demonstrate the validity of the approach.
Keywords :
Fourier transforms; blind source separation; delay estimation; speech recognition; speech synthesis; DUET; data speech recognition; mel-spectrographic mask estimation; multi-speaker environment; short time-Fourier-transform ratio estimators; source reconstruction; speech resynthesis; speech sources; stereo signals spectrographic masks; Attenuation; Australia; Automatic speech recognition; Microphones; Source separation; Speech coding; Speech enhancement; Time frequency analysis; masks; missing data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.366935
Filename :
4218123