DocumentCode
626520
Title
Auditory features based on Gammatone filters for robust speech recognition
Author
Jun Qi ; Dong Wang ; Yi Jiang ; Runsheng Liu
Author_Institution
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
fYear
2013
fDate
19-23 May 2013
Firstpage
305
Lastpage
308
Abstract
A major challenge for automatic speech recognition (ASR) is the significant drop in performance in noisy environments. Recent research has shown that auditory features based on Gammatone filters are a promising way to improve the robustness of ASR systems against noise, though this research is far from extensive and the generalizability of the new features is unknown. This paper presents our implementation of the Gammatone filter-based feature and experimental results on Mandarin speech data. With careful design, we obtained significant performance gains with the new feature in various noise conditions compared with the widely used MFCC and PLP features. A particular novelty of our implementation is that the filter design is purely in the time domain: the channel signals are obtained by applying a set of Gammatone filters directly to the speech signals in the time domain, in contrast to the commonly adopted frequency-domain design, which first converts signals to spectra and then applies the filter banks to them. The time-domain implementation avoids the approximation introduced by short-time spectral analysis and is hence more precise; it also avoids the complex spectral computation and hence simplifies hardware realization.
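The record does not give the paper's actual filter design, so the following is only a minimal sketch of what time-domain Gammatone filtering can look like, under stated assumptions. It uses the textbook Gammatone impulse response g(t) = t^(n-1) exp(-2πbt) cos(2πf_c t) with order n = 4 and b = 1.019 ERB(f_c), where ERB(f) = 24.7 (4.37 f / 1000 + 1) is the Glasberg and Moore bandwidth formula. The truncated FIR realization, the 25 ms impulse-response length, the centre frequencies, and all function names are illustrative assumptions, not the authors' implementation.

```python
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4):
    """Truncated, peak-normalised Gammatone impulse response (assumed FIR form)."""
    b = 1.019 * erb(fc_hz)          # bandwidth parameter of the envelope
    n_taps = int(duration_s * fs_hz)
    ir = []
    for k in range(n_taps):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]


def fir_filter(signal, ir):
    """Causal convolution in the time domain; output has the input's length."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(min(i + 1, len(ir))):
            acc += ir[j] * signal[i - j]
        out.append(acc)
    return out


def gammatone_channels(signal, fs_hz, centre_freqs_hz):
    """Apply one Gammatone filter per centre frequency directly to the waveform."""
    return [fir_filter(signal, gammatone_ir(fc, fs_hz)) for fc in centre_freqs_hz]


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


# Illustrative input: 0.1 s of a 1 kHz tone sampled at 8 kHz (hypothetical values).
fs = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * k / fs) for k in range(800)]
centre_freqs = [500.0, 1000.0, 2000.0]

channels = gammatone_channels(tone, fs, centre_freqs)
energies = [energy(ch) for ch in channels]
print(energies)
```

No spectrum is computed anywhere in this sketch: each channel signal comes from convolving the waveform with a filter's impulse response, which is the contrast with the frequency-domain design that the abstract draws. For the 1 kHz test tone, the channel centred at 1000 Hz carries the most energy.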
Keywords
audio signal processing; digital filters; feature extraction; spectral analysis; speech recognition; time-domain analysis; ASR; Gammatone filters; Mandarin speech data; automatic speech recognition; complex spectral computation; hardware realization; noisy environments; performance reduction; short-time spectral analysis; speech signals; time domain implementation; Frequency-domain analysis; Mel frequency cepstral coefficient; Noise; Robustness; Speech; robust speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Circuits and Systems (ISCAS), 2013 IEEE International Symposium on
Conference_Location
Beijing
ISSN
0271-4302
Print_ISBN
978-1-4673-5760-9
Type
conf
DOI
10.1109/ISCAS.2013.6571843
Filename
6571843