DocumentCode
773472
Title
Speaker recognition using hidden Markov models, dynamic time warping and vector quantisation
Author
Yu, K. ; Mason, J. ; Oglesby, J.
Author_Institution
Dept. of Electr. & Electron. Eng., Univ. Coll. of Swansea, UK
Volume
142
Issue
5
fYear
1995
fDate
10/1/1995
Firstpage
313
Lastpage
318
Abstract
The authors evaluate continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition, emphasising the performance of each model structure across incremental amounts of training data. Text-independent (TI) experiments are performed with VQ and CDHMMs, and text-dependent (TD) experiments are performed with DTW, VQ and CDHMMs. For TI speaker recognition, VQ performs better than an equivalent CDHMM when trained on a single training version, but is outperformed by the CDHMM when trained with ten training versions. For TD experiments, DTW outperforms VQ and CDHMMs when training data are sparse, but with more data the performance of the models is indistinguishable. The performance of the TD procedures is consistently superior to TI, which is attributed to subdividing the speaker recognition problem into smaller speaker-word problems. It is also shown that there is a large variation in performance across the different digits, and it is concluded that the digit zero is the best digit for speaker discrimination.
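The DTW matching referenced in the abstract aligns two variable-length utterances by dynamic programming and scores them by minimum cumulative frame distortion. A minimal sketch is given below; the function name `dtw_distance` is hypothetical, frames are simplified to scalars (the paper's actual front-end would produce cepstral feature vectors), and no slope or endpoint constraints from the original system are assumed.

```python
def dtw_distance(a, b):
    # Dynamic time warping: find the alignment of sequences a and b
    # that minimises the cumulative frame-to-frame distance.
    # Frames are plain floats here; a real recogniser would use
    # feature vectors with a vector distance measure.
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = best cumulative cost aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Standard symmetric local path: insertion, deletion, match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# A time-stretched copy of a sequence aligns with zero distortion:
print(dtw_distance([1.0, 2.0, 3.0], [1.0, 1.0, 2.0, 3.0]))  # 0.0
```

In a text-dependent system of the kind evaluated here, a test utterance of a digit would be scored against each enrolled speaker's template for that digit, and the speaker with the lowest DTW distortion accepted.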
Keywords
hidden Markov models; speaker recognition; vector quantisation; CDHMM; DTW; HMM; VQ; continuous density hidden Markov models; distortion-based vector quantisation; dynamic time warping; speaker discrimination; speaker-word problems; text-dependent experiments; text-independent experiments; training data
fLanguage
English
Journal_Title
IEE Proceedings - Vision, Image and Signal Processing
Publisher
IET
ISSN
1350-245X
Type
jour
DOI
10.1049/ip-vis:19952144
Filename
487791