DocumentCode :
706298
Title :
Early auditory processing inspired features for robust automatic speech recognition
Author :
Kalinli, Ozlem ; Narayanan, Shrikanth
Author_Institution :
Dept. of Electr. Eng.-Syst., Univ. of Southern California, Los Angeles, CA, USA
fYear :
2007
fDate :
3-7 Sept. 2007
Firstpage :
2385
Lastpage :
2389
Abstract :
In this paper, we derive bio-inspired features for automatic speech recognition based on the early processing stages in the human auditory system. The utility and robustness of the derived features are validated in a speech recognition task under a variety of noise conditions. First, we develop an auditory-based feature by replacing the filterbank analysis stage of Mel-frequency cepstral coefficient (MFCC) feature extraction with an auditory model that consists of cochlear filtering, inner hair cell, and lateral inhibitory network stages. Then, we propose a new feature set that retains only the cochlear channel outputs that are more likely to fire the neurons in the central auditory system. This feature set is extracted by principal component analysis (PCA) of the nonlinearly compressed early auditory spectrum. When evaluated in a connected digit recognition task using the Aurora 2.0 database, the proposed feature set yields average word error rate improvements of 40% and 18% relative to the MFCC and RelAtive SpecTrAl (RASTA) features, respectively.
Keywords :
channel bank filters; feature extraction; principal component analysis; speech recognition; Aurora 2.0 database; MFCC; MFCC feature extraction; Mel-frequency cepstral coefficients; PCA; RASTA; auditory processing inspired features; bioinspired features; central auditory system; cochlear channel outputs; cochlear filtering; digit recognition task; filter bank analysis stage; human auditory system; inner hair cell; lateral inhibitory network stages; noise conditions; principal component analysis; relative spectral features; robust automatic speech recognition; Auditory system; Feature extraction; Mel frequency cepstral coefficient; Noise; Robustness; Speech; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing Conference, 2007 15th European
Conference_Location :
Poznan
Print_ISBN :
978-839-2134-04-6
Type :
conf
Filename :
7099235