Spectro-temporal Gabor features for speaker recognition

Author

Lei, Howard ; Meyer, Bernd T. ; Mirghafori, Nikki

Author_Institution

Int. Comput. Sci. Inst., Berkeley, CA, USA

fYear

2012

fDate

25-30 March 2012

Firstpage

4241

Lastpage

4244

Abstract

This work investigates the performance of 2D Gabor features (also known as spectro-temporal features) for speaker recognition. Gabor features have been used mainly for automatic speech recognition (ASR), where they have yielded improvements. We explored different Gabor feature implementations, along with different speaker recognition approaches, on the ROSSI [1] and NIST SRE08 databases. On the noisy ROSSI database, the Gabor features alone performed as well as the MFCC features alone, and score-level combination of Gabor and MFCC features yielded an 8% relative EER improvement over the MFCC features alone. These results demonstrate the value of both spectral and temporal information for feature extraction, and the complementarity of Gabor and MFCC features.
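The abstract names three technical ingredients: a 2D (spectro-temporal) Gabor filter, score-level combination of two systems, and a relative EER improvement. The sketch below illustrates each in a minimal form. It is not the authors' implementation: the filter sizes, modulation frequencies, Hann envelope, z-normalization before fusion, equal fusion weight, and all function names are assumptions chosen for illustration, and the EER values in the usage lines are hypothetical numbers picked only to show the arithmetic.

```python
import numpy as np


def gabor_2d(n_freq=23, n_time=23, omega_k=0.25, omega_n=0.25):
    """Complex 2D Gabor filter: a plane-wave carrier under a Hann envelope.

    omega_k and omega_n are the spectral and temporal modulation
    frequencies in radians per bin/frame (illustrative values only).
    """
    k = np.arange(n_freq) - n_freq // 2   # frequency-channel offsets
    n = np.arange(n_time) - n_time // 2   # time-frame offsets
    envelope = np.outer(np.hanning(n_freq), np.hanning(n_time))
    carrier = np.exp(1j * (omega_k * k[:, None] + omega_n * n[None, :]))
    return envelope * carrier


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Score-level combination: weighted sum of z-normalized trial scores.

    Normalizing each system's scores first is an assumption made here so
    that the two systems are on a comparable scale.
    """
    a = (scores_a - scores_a.mean()) / scores_a.std()
    b = (scores_b - scores_b.mean()) / scores_b.std()
    return weight * a + (1.0 - weight) * b


def relative_improvement(eer_baseline, eer_new):
    """Relative EER improvement of a new system over a baseline."""
    return (eer_baseline - eer_new) / eer_baseline


# Hypothetical usage: one filter, three fused trial scores, and the
# relative improvement for made-up EERs of 10.0% and 9.2%.
gabor_filter = gabor_2d()
fused = fuse_scores(np.array([0.1, 0.9, 0.4]), np.array([2.0, 8.0, 5.0]))
improvement = relative_improvement(10.0, 9.2)
```

A filter of this kind would typically be correlated with a log mel-spectrogram to obtain one feature channel per filter; how the paper does this is not stated in the abstract.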

Keywords

Gabor filters; feature extraction; speech recognition; 2D Gabor features; MFCC features standalone; NIST SRE08 database; automatic speech recognition; noisy ROSSI database; score-level combination; spectro-temporal Gabor features; Databases; Frequency modulation; Mel frequency cepstral coefficient; NIST; Speaker recognition; Training; Gabor features; ROSSI database; spectral and temporal modulation

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6288855

Filename

6288855