A comparative study of mixture-Gaussian VQ, ergodic HMMs and left-to-right HMMs for speaker recognition

Author

Zhu, Xiaoyuan ; Millar, Bruce ; Macleod, Iain ; Wagner, Michael ; Chen, Fangxin ; Ran, Shuping

Author_Institution

Australian Nat. Univ., ACT, Australia

fYear

1994

fDate

13-16 Apr 1994

Firstpage

618

Abstract

This paper compares a mixture-Gaussian vector quantisation (VQ) method, ergodic continuous hidden Markov models (CHMMs) and phone-level left-to-right CHMMs for text-independent speaker recognition. The three methods form a progression of increasing phonetic specificity in the models from which the probabilities used to compare speakers are generated. The mixture-Gaussian VQ uses a single distribution for all phones; the ergodic CHMM uses several distributions, which a previous text-independent speaker recognition study showed to represent broad phonetic classes; and the phone-based left-to-right CHMM uses many distributions representing the specific phones in the test utterance. Our speaker recognition experiments on 40 TIMIT speakers show recognition rates of 87.5%, 87.5% and 100% for the mixture-Gaussian VQ, ergodic CHMMs and phone-based left-to-right CHMMs, respectively.
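As a rough illustration of the least phonetically specific approach mentioned in the abstract, the following sketch scores a test utterance against one Gaussian mixture per speaker and picks the best-scoring speaker. It is not the authors' implementation: the abstract gives no model details, so the diagonal covariances, the function names (`gmm_avg_loglik`, `identify_speaker`), the two hypothetical speakers and all numeric parameters are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance Gaussian mixture.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Log-density of every frame under every mixture component: shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )

    # Combine components with a numerically stable log-sum-exp.
    log_weighted = log_comp + np.log(weights)
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and the score of every speaker."""
    scores = {
        name: gmm_avg_loglik(frames, *params)
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical two-speaker, two-component models: (weights, means, variances).
speaker_models = {
    "spk_a": ([0.5, 0.5], [[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "spk_b": ([0.5, 0.5], [[5.0, 5.0], [7.0, 7.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Synthetic "test utterance": 50 two-dimensional frames near spk_a's first component.
rng = np.random.default_rng(0)
test_frames = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))

best, scores = identify_speaker(test_frames, speaker_models)
print(best, scores)
```

In this sketch every frame is scored against the same speaker-wide mixture regardless of which phone it came from, which is the sense in which a single distribution covers all phones. The HMM variants described in the abstract would instead condition the distributions on broad phonetic classes or on specific phones.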

Keywords

hidden Markov models; speech recognition; stochastic processes; vector quantisation; TIMIT speakers; continuous hidden Markov models; distributions; ergodic HMMs; left-to-right HMM; mixture-Gaussian VQ; mixture-Gaussian vector quantisation; phone-level CHMM; recognition rates; test utterance; text-independent speaker recognition; Acoustic testing; Hidden Markov models; Radio access networks; Robustness; Speaker recognition; Speech; Statistics; System testing; Topology; Vector quantization

fLanguage

English

Publisher

IEEE

Conference_Titel

Speech, Image Processing and Neural Networks, 1994. Proceedings, ISSIPNN '94., 1994 International Symposium on

Print_ISBN

0-7803-1865-X

Type

conf

DOI

10.1109/SIPNN.1994.344834

Filename

344834