Regional Information Center for Science and Technology - Enhanced Speech Features by Single-Channel Joint Compensation of Noise and Reverberation

DocumentCode :

1066657

Title :

Enhanced Speech Features by Single-Channel Joint Compensation of Noise and Reverberation

Author :

Wölfel, Matthias

Author_Institution :

Univ. Karlsruhe, Karlsruhe

Volume :

Issue :

fYear :

2009

Firstpage :

312

Lastpage :

323

Abstract :

Natural verbal communication between humans and machines requires automatic speech recognition that works reasonably well on recordings captured with mid- or far-field microphones. While a lot of research and development is devoted to addressing one of the two distortions frequently encountered in mid- and far-field sound pickup, namely noise or reverberation, less effort has been made to combat both kinds of distortion jointly. In our view, however, this is essential to further reduce the degradation caused by moving the microphone away from the speaker's mouth, because in real environments both kinds of distortion are present. In this paper, we propose a first step in this direction by integrating an estimate of the reverberation energy, derived by an auxiliary model based on multistep linear prediction, into a framework which so far tracks and removes nonstationary additive distortion by particle filters in a low-dimensional logarithmic power frequency domain. On actual recordings with different speaker-to-microphone distances, we observe that combating either nonstationary noise or reverberation alone in the feature space, on a single channel, already improves speech recognition performance before and after acoustic model adaptation. Furthermore, we observe that a simple concatenation of techniques addressing either additive noise or reverberation can further improve the accuracy in some cases. Finally, we demonstrate that the joint estimation and removal of both kinds of distortion, as proposed in this publication, further improves the accuracy of the text output.

Keywords :

iterative methods; particle filtering (numerical methods); reverberation; speech recognition; automatic speech recognition; enhanced speech features; low dimension logarithmic power frequency domain; nonstationary noise; particle filters; reverberation alone; single channel; single-channel joint compensation; Acoustic distortion; Acoustic noise; Automatic speech recognition; Humans; Microphones; Mouth; Research and development; Reverberation; Speech enhancement; Working environment noise; Automatic speech recognition (ASR); joint removal of additive and reverberant distortions; multistep linear prediction (MSLP); particle filter; speech feature enhancement;

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2008.2009161

Filename :

4749462

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1066657