Neural net approaches to speaker verification: comparison with second order statistic measures

Author

Homayounpour, M. Mehdi ; Chollet, Gérard

Author_Institution

URA, CNRS, Paris, France

Volume

1

fYear

1995

fDate

9-12 May 1995

Firstpage

353

Abstract

Kohonen's unsupervised self-organizing map (SOM), the supervised learning vector quantization algorithm (LVQ3), and a method based on second-order statistical measures (SOSM) were adapted, evaluated and compared for speaker verification on 57 speakers of a POLYPHONE-like database. The SOM and LVQ3 were trained with codebooks of 32 and 256 codes, and two statistical measures were implemented: one without weighting (SOSM1) and one with weighting (SOSM2). The equal error rate (EER) and the best match decision rule (BMDR) were employed and evaluated as decision criteria. Weighted linear predictive cepstrum coefficients (LPCC) and ΔLPCC, two kinds of spectral speech representation, were combined in a single feature vector. The LVQ3 demonstrates a performance advantage over the SOM. This is because the LVQ3 allows long-term fine-tuning of the target codebook using speech data from both the client and other speakers, whereas the SOM uses data from the client only. The SOSM performs better than the SOM and the LVQ3 for long test utterances, while for short test utterances the LVQ3 is the best of the methods studied.
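
For readers unfamiliar with the equal error rate named in the abstract, the following is a minimal sketch of how an EER can be estimated from verification scores. It is not the authors' procedure: the function name `equal_error_rate`, the convention that a higher score means "more client-like", and the simple threshold sweep over observed scores are all assumptions made for illustration.

```python
def equal_error_rate(client_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold,
    so higher scores mean "more likely the claimed client".
    Returns (eer, threshold) at the point where the false acceptance
    rate and false rejection rate are closest.
    """
    thresholds = sorted(set(client_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejection: a genuine client trial scored below the threshold.
        frr = sum(s < t for s in client_scores) / len(client_scores)
        # False acceptance: an impostor trial scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the mean of the two rates where they are closest.
            best = (gap, (far + frr) / 2, t)
    _, eer, threshold = best
    return eer, threshold


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    clients = [0.9, 0.8, 0.7, 0.4]
    impostors = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(clients, impostors))
```

If a system produces distances rather than similarities (lower is better), the scores can be negated before calling the function.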

Keywords

cepstral analysis; error statistics; learning (artificial intelligence); neural nets; prediction theory; self-organising feature maps; speaker recognition; speech coding; statistical analysis; vector quantisation; LVQ; POLYPHONE-like data base; best match decision rule; codebooks; decision criterion; equal error rate; long test utterances; long-term fine-tuning; nonsupervised self organizing map; performance; second order statistic measures; short test utterances; speaker verification; spectral speech representations; speech data; supervised learning vector quantization algorithm; weighted linear predictive cepstrum coefficients; weighting; Cepstrum; Error analysis; Neural networks; Organizing; Performance evaluation; Speech; Supervised learning; Testing; Vector quantization; Weight measurement

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on

Conference_Location

Detroit, MI

ISSN

1520-6149

Print_ISBN

0-7803-2431-5

Type

conf

DOI

10.1109/ICASSP.1995.479594

Filename

479594