Regional Information Center for Science and Technology - Speech spectrogram based model adaptation for speaker identification

DocumentCode :

2043608

Title :

Speech spectrogram based model adaptation for speaker identification

Author :

Gurbuz, Sabri ; Gowdy, John N. ; Tufekci, Zekeriya

Author_Institution :

Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA

fYear :

2000

fDate :

2000

Firstpage :

110

Lastpage :

115

Abstract :

Speech signal feature extraction is a challenging research area of great significance to the speaker identification and speech recognition communities. We propose a novel speech spectrogram based spectral model adaptation algorithm for text-dependent speaker identification, built on dynamic thresholding of speech spectrograms. Given an utterance from a target speaker, the aim is to identify that speaker among the speakers enrolled in the system. Conceptually, the algorithm attempts to increase the spectral similarity for the target speaker while increasing the spectral dissimilarity for the non-target speakers who are members of the enrollment set. It thereby removes aging- and intersession-dependent spectral variation in the utterance while preserving the speaker-inherent spectral features. The hidden Markov model (HMM) parameters representing each enrolled speaker are adapted for each identification event. Results obtained using speech signals from the Noisex database and from recordings made in a laboratory environment are promising and suggest that the algorithm is robust to aging and session-dependent utterances. Additionally, we evaluated the adapted and non-adapted models with data recorded two months after the initial enrollment. The adaptation appears to improve the performance of the system on the aged data from 84% to 91%.

Keywords :

adaptive signal processing; feature extraction; hidden Markov models; speaker recognition; spectral analysis; HMM parameters; Noisex database; adapted models; aged data; aging; algorithm robustness; dynamic thresholding; enrolment set; hidden Markov model; laboratory environment recordings; nonadapted models; nontarget speaker; session-dependent utterances; speaker inherent spectral features; spectral modal adaptation algorithm; spectral similarity; speech recognition; speech signal feature extraction; speech signals; speech spectrogram based model adaptation; system performance; target speaker; text-dependent speaker identification; Adaptation model; Aging; Feature extraction; Hidden Markov models; Noise robustness; Signal processing; Spatial databases; Spectrogram; Speech recognition; Working environment noise;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Southeastcon 2000. Proceedings of the IEEE

Conference_Location :

Nashville, TN

Print_ISBN :

0-7803-6312-4

Type :

conf

DOI :

10.1109/SECON.2000.845443

Filename :

845443

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2043608