DocumentCode :
2854403
Title :
Increased MFCC filter bandwidth for noise-robust phoneme recognition
Author :
Skowronski, Mark D. ; Harris, John G.
Author_Institution :
Computational Neuro-Engineering Laboratory, University of Florida, USA
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
Many speech recognition systems use mel-frequency cepstral coefficient (MFCC) feature extraction as a front end. In the algorithm, a speech spectrum passes through a filter bank of mel-spaced triangular filters, and the filter output energies are log-compressed and transformed to the cepstral domain by the DCT. The spacing of the filter bank center frequencies mimics the known warped-frequency characteristics of the human auditory system, yet the bandwidths of these filters are not chosen through biological inspiration. Instead, they are set by aligning the endpoints of each triangle, which is itself an arbitrary shape. It is surprising that, for such a popular speech recognition front end, proper analysis or optimization of the filter bandwidths has not been performed. Complex cochlear models use realistic filter shapes that more closely approximate critical bands. Compared with the filters used in MFCC, these filters are considerably wider and overlap more with neighboring filters. We have extended this filter characteristic to the MFCC algorithm and found that the increased filter bandwidth improves recognition performance in clean speech and provides added noise robustness as well.
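The pipeline described in the abstract (mel-spaced triangular filters, log compression, DCT) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the filters were widened, so the `width_scale` parameter, the choice to draw the triangles on the mel axis, the common 2595/700 mel formula, and all function names are assumptions made for this example.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the record does not give one).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def triangular_filter_bank(n_filters, n_fft, sample_rate, width_scale=1.0):
    """Return an (n_filters, n_fft // 2 + 1) matrix of triangular filters.

    width_scale=1.0 places each triangle's endpoints on the neighboring
    center frequencies (the conventional layout). width_scale > 1.0
    stretches every triangle about its center on the mel axis; this is a
    hypothetical way to widen the filters, chosen only for illustration.
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_freqs = hz_to_mel(freqs)
    # n_filters centers plus two outer edges, equally spaced in mel.
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    half_width = width_scale * (mel_points[1] - mel_points[0])
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        center = mel_points[i + 1]
        # Triangle: 1 at the center, falling linearly to 0 at +/- half_width.
        bank[i] = np.clip(1.0 - np.abs(mel_freqs - center) / half_width, 0.0, None)
    return bank

def dct_ii(x, n_coeffs):
    # Unnormalized DCT-II of a 1-D vector, keeping the first n_coeffs terms.
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc_frame(power_spectrum, bank, n_coeffs=13):
    # Filter-bank energies -> log compression -> cepstral coefficients.
    energies = bank @ power_spectrum
    log_energies = np.log(energies + 1e-10)  # small floor avoids log(0)
    return dct_ii(log_energies, n_coeffs)

# Conventional bank versus a bank with doubled filter width.
narrow = triangular_filter_bank(20, 512, 16000)
wide = triangular_filter_bank(20, 512, 16000, width_scale=2.0)
coeffs = mfcc_frame(np.ones(257), wide)
```

With `width_scale=2.0`, each filter covers more FFT bins and overlaps more with its neighbors, which is the property the abstract attributes to cochlear-model filters; the factor of 2 is arbitrary and not taken from the paper.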
Keywords :
Feature extraction; System-on-a-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743839
Filename :
5743839