Title :
Non-linear spectro-temporal modulations for reverberant speech recognition
Author :
Matassoni, Marco ; Maganti, Hari Krishna ; Omologo, Maurizio
Author_Institution :
Center for Inf. Technol., Fondazione Bruno Kessler, Trento, Italy
Date :
May 30 2011-June 1 2011
Abstract :
This paper introduces a novel set of non-linear spectro-temporal features that improve automatic speech recognition performance in the presence of room reverberation. The features are derived from auditory characteristics: gammatone filtering, non-linear processing, and modulation spectral processing emulate mechanisms of the cochlea and middle ear that contribute to the robustness of human hearing. Experiments are performed on the Aurora-5 meeting recorder digit (MRD) task, recorded with four different distant microphones in hands-free mode in a real meeting room. For comparison, recognition results obtained with conventional features are also reported. The experimental results show that the proposed features provide considerable improvements over state-of-the-art feature extraction techniques.
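The three stages named in the abstract (gammatone filtering, a non-linearity, and modulation spectral processing) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameters, so the band count, the geometric spacing of centre frequencies, the cube-root compression, the 25 ms / 10 ms framing, and the 2-16 Hz modulation range are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def erb(fc):
    # Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4):
    # Truncated impulse response of a gammatone filter centred at fc (Hz).
    t = np.arange(int(round(duration * fs))) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalise to unit energy

def modulation_features(x, fs=8000, n_bands=8, frame=0.025, hop=0.010,
                        mod_band=(2.0, 16.0)):
    # ASSUMPTION: geometric centre-frequency spacing; the paper may differ.
    centres = np.geomspace(100.0, 0.45 * fs, n_bands)
    flen, fhop = int(round(frame * fs)), int(round(hop * fs))
    n_frames = 1 + (len(x) - flen) // fhop
    env = np.empty((n_bands, n_frames))
    for k, fc in enumerate(centres):
        # Stage 1: gammatone filtering by direct convolution.
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        # Short-time energy of the band output, one value per frame.
        for m in range(n_frames):
            seg = y[m * fhop: m * fhop + flen]
            env[k, m] = np.mean(seg ** 2)
    # Stage 2: compressive non-linearity (ASSUMPTION: cube root).
    env = np.cbrt(env)
    # Stage 3: modulation spectrum, i.e. the magnitude FFT of each band's
    # mean-removed envelope trajectory along the frame axis.
    spec = np.abs(np.fft.rfft(env - env.mean(axis=1, keepdims=True), axis=1))
    freqs = np.fft.rfftfreq(n_frames, d=hop)
    keep = (freqs >= mod_band[0]) & (freqs <= mod_band[1])
    return spec[:, keep], freqs[keep]
```

Calling `modulation_features(x)` on a one-dimensional signal returns an array of shape `(n_bands, n_modulation_bins)` together with the modulation frequencies in Hz. A real front end would typically compute such features over sliding windows and pass them to the recogniser, which this sketch does not attempt.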
Keywords :
feature extraction; filtering theory; modulation; reverberation; speech recognition; Aurora-5 meeting recorder digit task; automatic speech recognition performance; distant microphones; feature extraction techniques; gammatone filtering; modulation spectral processing; nonlinear processing; nonlinear spectro-temporal modulations; reverberant speech recognition; room reverberation; Feature extraction; Frequency modulation; Mel frequency cepstral coefficient; Robustness; Speech; Speech recognition; Automatic speech recognition; auditory processing; modulation spectrum; non-linearity; reverberation; robustness;
Conference_Title :
Hands-free Speech Communication and Microphone Arrays (HSCMA), 2011 Joint Workshop on
Conference_Location :
Edinburgh
Print_ISBN :
978-1-4577-0997-5
DOI :
10.1109/HSCMA.2011.5942376