DocumentCode
3165465
Title
Intonational speaker verification: A study on parameters and performance under noisy conditions
Author
Siddiq, Sadjad ; Kinnunen, Tomi ; Vainio, Martti ; Werner, Stefan
Author_Institution
Univ. of Eastern Finland, Joensuu, Finland
fYear
2012
fDate
25-30 March 2012
Firstpage
4777
Lastpage
4780
Abstract
Prosody-based speaker verification using fundamental frequency (f0) is considered. Our study consists of two phases. First, we do extensive optimization of parameters to establish a baseline system before dealing with noisy conditions. This includes a study of f0 extractor parameters, choice of features (discrete cosine transform, discrete Fourier transform, Legendre polynomials, linear prediction), f0 track interpolation (none, linear, Hermite), framing parameters and windowing (none, Hamming), f0 representation domain (linear, log), number of transformation coefficients and, finally, use of higher-level delta coefficients. Using the optimized parameters, we then explore the robustness of prosody features under white noise and factory noise degradations. Using a GMM-UBM system on the NIST 2006 SRE corpus, we reach an EER of 28.4 % and 27.6 % for the intonational and MFCC features respectively at -20 dB SNR white noise contamination; fusion of the two yields an EER of 24.38 %.
Keywords
discrete Fourier transforms; discrete cosine transforms; polynomials; speaker recognition; Legendre polynomials; baseline system; delta coefficients; discrete Fourier transform; discrete cosine transform; extractor parameters; factory noise degradation; framing parameters; fundamental frequency; intonational speaker verification; linear prediction; noisy conditions; prosody based speaker verification; prosody features; representation domain; track interpolation; transformation coefficients; white noise contamination; Feature extraction; Interpolation; Mel frequency cepstral coefficient; Speech; prosodic features
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location
Kyoto
ISSN
1520-6149
Print_ISBN
978-1-4673-0045-2
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2012.6288987
Filename
6288987