Non-linear spectro-temporal modulations for reverberant speech recognition

Author

Matassoni, Marco ; Maganti, Hari Krishna ; Omologo, Maurizio

Author_Institution

Center for Information Technology, Fondazione Bruno Kessler, Trento, Italy

fYear

2011

fDate

May 30-June 1, 2011

Firstpage

115

Lastpage

120

Abstract

This paper introduces a novel set of non-linear spectro-temporal features that improve automatic speech recognition performance in the presence of room reverberation. The features are derived from auditory characteristics and combine gammatone filtering, non-linear processing, and modulation spectral processing to emulate the mechanisms of the cochlea and middle ear that make human hearing robust. Experiments are performed on the Aurora-5 meeting recorder digit (MRD) task, recorded with four different distant microphones in hands-free mode in a real meeting room. For comparison, recognition results obtained with conventional features are also reported. The experimental results show that the proposed features provide considerable improvements over state-of-the-art feature extraction techniques.

Keywords

feature extraction; filtering theory; modulation; reverberation; speech recognition; Aurora-5 meeting recorder digit task; automatic speech recognition performance; distant microphones; feature extraction techniques; gammatone filtering; modulation spectral processing; nonlinear processing; nonlinear spectro-temporal modulations; reverberant speech recognition; room reverberation; Feature extraction; Frequency modulation; Mel frequency cepstral coefficient; Robustness; Speech; Speech recognition; Automatic speech recognition; auditory processing; modulation spectrum; non-linearity; reverberation; robustness;

fLanguage

English

Publisher

IEEE

Conference_Title

2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA)

Conference_Location

Edinburgh

Print_ISBN

978-1-4577-0997-5

Type

conf

DOI

10.1109/HSCMA.2011.5942376

Filename

5942376