DocumentCode :
2043608
Title :
Speech spectrogram based model adaptation for speaker identification
Author :
Gurbuz, Sabri ; Gowdy, John N. ; Tufekci, Zekeriyu
Author_Institution :
Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA
fYear :
2000
fDate :
2000
Firstpage :
110
Lastpage :
115
Abstract :
Speech signal feature extraction is a challenging research area with great significance to the speaker identification and speech recognition communities. We propose a novel speech spectrogram based spectral model adaptation algorithm for text-dependent speaker identification, built on dynamic thresholding of speech spectrograms. Given an utterance from a target speaker, the aim is to identify that speaker among the speakers enrolled in the system. Conceptually, the algorithm attempts to increase the spectral similarity for the target speaker while increasing the spectral dissimilarity for non-target speakers in the enrollment set. It thereby removes aging- and intersession-dependent spectral variation in the utterance while preserving the speaker-inherent spectral features. The hidden Markov model (HMM) parameters representing each enrolled speaker are adapted for each identification event. Results obtained using speech signals from the Noisex database and from recordings made in a laboratory environment are promising and suggest that the algorithm is robust to aging and session-dependent variation in utterances. We also evaluated the adapted and non-adapted models on data recorded two months after the initial enrollment; adaptation appears to improve system performance on this aged data from 84% to 91%.
Keywords :
adaptive signal processing; feature extraction; hidden Markov models; speaker recognition; spectral analysis; HMM parameters; Noisex database; adapted models; aged data; aging; algorithm robustness; dynamic thresholding; enrolment set; hidden Markov model; laboratory environment recordings; nonadapted models; nontarget speaker; session-dependent utterances; speaker inherent spectral features; spectral modal adaptation algorithm; spectral similarity; speech recognition; speech signal feature extraction; speech signals; speech spectrogram based model adaptation; system performance; target speaker; text-dependent speaker identification; Adaptation model; Aging; Feature extraction; Hidden Markov models; Noise robustness; Signal processing; Spatial databases; Spectrogram; Speech recognition; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Southeastcon 2000. Proceedings of the IEEE
Conference_Location :
Nashville, TN
Print_ISBN :
0-7803-6312-4
Type :
conf
DOI :
10.1109/SECON.2000.845443
Filename :
845443