DocumentCode :
3527583
Title :
Acoustic model combination to compensate for residual noise in multi-channel source separation
Author :
Yoon, Jae Sam ; Park, Ji Hun ; Kim, Hong Kook
Author_Institution :
Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
3925
Lastpage :
3928
Abstract :
In this paper, we propose an acoustic model combination technique for reducing the mismatch in a multi-channel noisy environment. To this end, we first apply a mask-based multi-channel source separation method, typically based on computational auditory scene analysis (CASA), to separate the speech source from noise. However, because the estimated mask is not ideal, a certain degree of noise remains in the separated speech source, especially under low signal-to-noise ratio (SNR) conditions. Thus, the performance of automatic speech recognition (ASR) is limited. To improve ASR performance, the remaining noise can be further compensated in the acoustic model domain under the framework of parallel model combination (PMC). In particular, a noise model for PMC is estimated from the noise remaining after application of the mask-based source separation, and the SNR for PMC is estimated from the average relative magnitude of the mask over the utterance. Experiments show that the proposed acoustic model combination method achieves a relative word error rate reduction of 52.14% compared to mask-based source separation alone.
Keywords :
acoustic signal processing; source separation; speech recognition; ASR performance; PMC approach; acoustic model combination method; automatic speech recognition; computational auditory scene analysis; mask-based multichannel source separation; parallel model combination; residual noise; Acoustic noise; Automatic speech recognition; Image analysis; Noise reduction; Signal to noise ratio; Source separation; Speech analysis; Speech coding; Speech enhancement; Working environment noise; Speech recognition; computational auditory scene analysis; mask-based SNR estimation; mask-based noise model estimation; multi-channel source separation; parallel model combination;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960486
Filename :
4960486