Effective use of DCTs for contextualizing features for speaker recognition

Author

McLaren, Moray ; Scheffer, Nicolas ; Ferrer, Luciana ; Yun Lei

Author_Institution

Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA

fYear

2014

fDate

4-9 May 2014

Firstpage

4027

Lastpage

4031

Abstract

This article proposes a new approach for contextualizing features for speaker recognition through the discrete cosine transform (DCT). Specifically, we apply a 2D-DCT to the Mel filterbank outputs to replace the common Mel frequency cepstral coefficients (MFCCs) appended with deltas and double deltas. A thorough comparison of algorithms for delta computation and DCT-based contextualization for speaker recognition is provided, and the effect of varying the size of the analysis window in each case is considered. Selection of 2D-DCT coefficients using a zig-zag approach permits definition of an arbitrary feature dimension using the most energized coefficients. We show that 60 coefficients computed using our approach outperform the standard MFCCs appended with double deltas by up to 25% relative on the NIST 2012 speaker recognition evaluation (SRE) corpus in both Cprimary and equal error rate (EER), while additional coefficients increase system robustness to noise.
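The contextualization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 9-frame window, the 24 Mel filters, the edge padding, the orthonormal DCT-II scaling, and the JPEG-style zig-zag ordering are all assumptions chosen for illustration, and the function names (`dct_matrix`, `zigzag_indices`, `contextualize`) are hypothetical. The input is assumed to be a matrix of log Mel filterbank energies with one row per frame.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix; row k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)  # rescale the DC row so the basis is orthonormal
    return mat


def zigzag_indices(n_rows, n_cols):
    """(row, col) pairs ordered by anti-diagonal, alternating direction (JPEG-style)."""
    order = []
    for s in range(n_rows + n_cols - 1):
        diag = [(r, s - r) for r in range(n_rows) if 0 <= s - r < n_cols]
        order.extend(diag if s % 2 else diag[::-1])
    return order


def contextualize(log_fbank, window=9, n_coeffs=60):
    """Apply a 2D-DCT to a sliding window of filterbank frames.

    log_fbank: array of shape (n_frames, n_filters).
    window:    odd number of frames per analysis window (assumed value).
    n_coeffs:  number of zig-zag-selected coefficients kept per frame.
    Returns an array of shape (n_frames, n_coeffs).
    """
    n_frames, n_filters = log_fbank.shape
    half = window // 2
    # Repeat the first and last frames so every frame has a full context window.
    padded = np.pad(log_fbank, ((half, half), (0, 0)), mode="edge")
    d_freq = dct_matrix(n_filters)
    d_time = dct_matrix(window)
    selected = zigzag_indices(n_filters, window)[:n_coeffs]
    rows = [r for r, _ in selected]
    cols = [c for _, c in selected]
    feats = np.empty((n_frames, n_coeffs))
    for t in range(n_frames):
        patch = padded[t:t + window].T          # (n_filters, window)
        coeffs = d_freq @ patch @ d_time.T      # DCT over frequency, then time
        feats[t] = coeffs[rows, cols]           # low-order coefficients first
    return feats


# Usage with random stand-in data (100 frames, 24 filters).
rng = np.random.default_rng(0)
demo = contextualize(rng.standard_normal((100, 24)))
print(demo.shape)  # (100, 60)
```

The zig-zag scan keeps coefficients that are low-order in both frequency and time first, so the feature dimension can be set to any value up to the window size times the number of filters.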

Keywords

channel bank filters; discrete cosine transforms; speaker recognition; 2D-DCT coefficient selection; 2D-DCT transform; DCT-based contextualization; EER; MFCCs; Mel filter bank outputs; Mel frequency cepstral coefficients; NIST 2012 speaker recognition evaluation corpus; SRE; analysis window size; arbitrary feature dimension; contextualizing features; discrete cosine transform; double deltas; equal error rate; most energized coefficients; speaker recognition; zig-zag approach; Discrete cosine transforms; Feature extraction; NIST; Noise measurement; Speaker recognition; Speech; Speech recognition; 2D-DCT; Contextualization; Deltas; Filterbank Energies; Speaker Recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854358

Filename

6854358