DocumentCode
294590
Title
Neural net approaches to speaker verification: comparison with second order statistic measures
Author
Homayounpour, M. Mehdi ; Chollet, Gérard
Author_Institution
URA, CNRS, Paris, France
Volume
1
fYear
1995
fDate
9-12 May 1995
Firstpage
353
Abstract
Kohonen's unsupervised self-organizing map (SOM), the supervised learning vector quantization algorithm (LVQ3), and a method based on second-order statistical measures (SOSM) were adapted, evaluated, and compared for speaker verification on 57 speakers of a POLYPHONE-like database. The SOM and LVQ3 were trained with codebooks of 32 and 256 codes, and two statistical measures were implemented: one without weighting (SOSM1) and one with weighting (SOSM2). The equal error rate (EER) and the best match decision rule (BMDR) were employed and evaluated as decision criteria. The weighted linear predictive cepstrum coefficients (LPCC) and the ΔLPCC, two kinds of spectral speech representation, were combined in a single feature vector. The LVQ3 shows a performance advantage over the SOM because it allows long-term fine-tuning of the target codebook of interest using speech data from both the client and other speakers, whereas the SOM uses data from the client only. The SOSM performs better than the SOM and the LVQ3 on long test utterances, while on short test utterances the LVQ3 is the best of the methods studied.
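The abstract names the equal error rate as a decision criterion but does not say how it was computed. The following is a minimal sketch of one common way to estimate an EER from verification scores, not the authors' procedure. It assumes that a higher score means "more likely the client", that a claim is accepted when the score is at or above the threshold, and that candidate thresholds are taken from the observed scores. The function name `equal_error_rate` and the sample scores are hypothetical and are not taken from the paper.

```python
def equal_error_rate(client_scores, impostor_scores):
    """Estimate the EER and the threshold at which it occurs.

    Assumption: higher score = more client-like; accept when score >= threshold.
    """
    # Candidate thresholds: every distinct observed score.
    thresholds = sorted(set(client_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejection rate: client trials scored below the threshold.
        frr = sum(s < t for s in client_scores) / len(client_scores)
        # False acceptance rate: impostor trials scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Illustrative scores only (not data from the paper).
client = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(client, impostor)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With finite score sets the two error curves rarely cross exactly, so the sketch reports the average of the false acceptance and false rejection rates at the threshold where they are closest.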
Keywords
cepstral analysis; error statistics; learning (artificial intelligence); neural nets; prediction theory; self-organising feature maps; speaker recognition; speech coding; statistical analysis; vector quantisation; LVQ; POLYPHONE-like data base; best match decision rule; codebooks; decision criterion; equal error rate; long test utterances; long-term fine-tuning; nonsupervised self organizing map; performance; second order statistic measures; short test utterances; speaker verification; spectral speech representations; speech data; supervised learning vector quantization algorithm; weighted linear predictive cepstrum coefficients; weighting; Cepstrum; Error analysis; Neural networks; Organizing; Performance evaluation; Speech; Supervised learning; Testing; Vector quantization; Weight measurement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location
Detroit, MI
ISSN
1520-6149
Print_ISBN
0-7803-2431-5
Type
conf
DOI
10.1109/ICASSP.1995.479594
Filename
479594