Regional Information Center for Science and Technology - Incremental speaker adaptation with minimum error discriminative training for speaker identification

DocumentCode :

312314

Title :

Incremental speaker adaptation with minimum error discriminative training for speaker identification

Author :

Álamo, C. Martín del ; Álvarez, J. ; de la Torre, C. ; Poyatos, F.J. ; Hernández, L.

Author_Institution :

Speech Technol. Group, Telefonica Investigacion y Desarrollo, Madrid, Spain

Volume :

fYear :

1996

fDate :

3-6 Oct 1996

Firstpage :

1760

Abstract :

Minimum classification error (MCE) training has been shown to be effective in improving the performance of speaker identification systems. However, problems remain, such as the variability of a speaker's voice characteristics over time. In this paper, we analyze the degradation of a Gaussian mixture model (GMM) based text-independent speaker identification system when it is tested on data recorded over six months after the training session and, in an attempt to avoid this degradation, we study supervised adaptation based on maximum a posteriori (MAP) estimation and MCE. These techniques have been shown to provide good results for speaker adaptation in speech recognition. Our major result is that, starting from GMMs trained only on speech from session 1, similar identification results can be obtained for all the other sessions through incremental adaptation that uses only 2.5 seconds of speech per speaker and session as data for the MCE training adaptation procedure. We have also found that, in our extreme experimental setup, MAP becomes unhelpful when combined with MCE adaptation.
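The two building blocks named in the abstract, GMM-based identification and MAP adaptation, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs, adapts only the component means, and uses a relevance-factor form of MAP whose `relevance` value is an arbitrary illustrative choice. The MCE discriminative step is not shown, and all function and variable names are hypothetical.

```python
import numpy as np


def _log_weighted(frames, weights, means, variances):
    """Per-frame, per-component log(weight * Gaussian); shape (T, M)."""
    diff = frames[:, None, :] - means[None, :, :]  # (T, M, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )
    return log_gauss + np.log(weights)


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    lw = _log_weighted(frames, weights, means, variances)
    peak = lw.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    per_frame = peak[:, 0] + np.log(np.exp(lw - peak).sum(axis=1))
    return per_frame.mean()


def identify(frames, models):
    """Return the name of the speaker model that scores the frames highest."""
    scores = {name: gmm_log_likelihood(frames, *m) for name, m in models.items()}
    return max(scores, key=scores.get)


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-factor MAP update of the means only (an assumed variant)."""
    lw = _log_weighted(frames, weights, means, variances)
    lw -= lw.max(axis=1, keepdims=True)
    post = np.exp(lw)
    post /= post.sum(axis=1, keepdims=True)        # component posteriors (T, M)
    counts = post.sum(axis=0)                      # soft frame counts (M,)
    data_means = (post.T @ frames) / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]
    # Interpolate between the new data and the prior (session-1) means.
    return alpha * data_means + (1.0 - alpha) * means
```

In an incremental scheme such as the one the abstract describes, the means returned for one session would serve as the prior means for the next session. With very little adaptation data the soft counts stay small, so `alpha` is small and the update remains close to the prior model.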

Keywords :

Gaussian distribution; errors; maximum likelihood estimation; pattern classification; speaker recognition; 2.5 s; GMM-based text-independent speaker identification system; Gaussian mixture model; incremental speaker adaptation; maximum a posteriori estimation; minimum classification error; minimum error discriminative training; performance; speech recognition; supervised adaptation; system degradation; training adaptation procedure; voice characteristics; Cepstral analysis; Databases; Density functional theory; Hidden Markov models; Linear predictive coding; Loss measurement; Parameter estimation; Speaker recognition; State estimation; Training data;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location :

Philadelphia, PA

Print_ISBN :

0-7803-3555-4

Type :

conf

DOI :

10.1109/ICSLP.1996.607969

Filename :

607969

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=312314