DocumentCode
2287285
Title
A comparative study of mixture-Gaussian VQ, ergodic HMMs and left-to-right HMMs for speaker recognition
Author
Zhu, Xiaoyuan ; Millar, Bruce ; Macleod, Iain ; Wagner, Michael ; Chen, Fangxin ; Ran, Shuping
Author_Institution
Australian Nat. Univ., ACT, Australia
fYear
1994
fDate
13-16 Apr 1994
Firstpage
618
Abstract
This paper compares a mixture-Gaussian vector quantisation (VQ) method, ergodic continuous hidden Markov models (CHMMs) and phone-level left-to-right CHMMs for text-independent speaker recognition. These three methods represent a progression of phonetic specificity prior to the generation of probabilities against which speakers are compared. The mixture-Gaussian VQ uses a single distribution for all phones, the ergodic CHMM uses several distributions which have been shown in a previous text-independent speaker recognition study to represent broad phonetic classes, and the phone-based left-to-right CHMM uses many distributions representing the specific phones in the test utterance. Our experiments with speaker recognition on 40 TIMIT speakers show that the recognition rates of the mixture-Gaussian VQ, ergodic CHMMs and phone-based left-to-right CHMMs are 87.5%, 87.5% and 100%, respectively.
Keywords
hidden Markov models; speech recognition; stochastic processes; vector quantisation; TIMIT speakers; continuous hidden Markov models; distributions; ergodic HMMs; left-to-right HMM; mixture-Gaussian VQ; mixture-Gaussian vector quantisation; phone-level CHMM; recognition rates; test utterance; text-independent speaker recognition; Acoustic testing; Hidden Markov models; Radio access networks; Robustness; Speaker recognition; Speech; Statistics; System testing; Topology; Vector quantization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Speech, Image Processing and Neural Networks, 1994. Proceedings, ISSIPNN '94., 1994 International Symposium on
Print_ISBN
0-7803-1865-X
Type
conf
DOI
10.1109/SIPNN.1994.344834
Filename
344834