Cohort selection and word grammar effects for speaker recognition

Author

Colombi, J.M. ; Ruck, D.W. ; Anderson, T.R. ; Rogers, S.K. ; Oxley, M.

Author_Institution

Air Force Inst. of Technol., Wright-Patterson AFB, OH, USA

Volume

1

fYear

1996

fDate

7-10 May 1996

Firstpage

85

Abstract

Automatic speaker recognition systems are maturing, and databases have been designed specifically to compare algorithms and results against target error rates. The LDC YOHO speaker verification database was designed to test error rates at the 1% false rejection and 0.1% false acceptance level. This work examines the use of speaker-dependent (SD) monophone models to meet these requirements. By representing each speaker with 22 monophones, both closed-set speaker identification and global-threshold verification were performed. Using four combination lock phrases, speaker identification error rates of 0.19% for males and 0.31% for females are obtained. By defining a test hypothesis, a critical error analysis for speaker verification is developed and new results are reported for YOHO. A new Bhattacharyya distance is developed for cohort selection. This method, based on the second-order statistics of the enrolment Viterbi log-likelihoods, determines the optimal cohorts and achieves an equal error rate of 0.282%.
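
The abstract does not give the form of the new distance, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each speaker's enrolment Viterbi log-likelihoods are summarised by a mean and a variance, uses the textbook Bhattacharyya distance between two univariate Gaussians, and assumes a cohort is made up of the nearest competing speakers. The names `bhattacharyya_gaussian`, `select_cohort`, and the sample statistics are hypothetical.

```python
import math


def bhattacharyya_gaussian(mean_a, var_a, mean_b, var_b):
    """Textbook Bhattacharyya distance between two univariate Gaussians.

    Assumption: this is the standard closed form, not the variant
    developed in the paper.
    """
    avg_var = 0.5 * (var_a + var_b)
    mean_term = 0.125 * (mean_a - mean_b) ** 2 / avg_var
    var_term = 0.5 * math.log(avg_var / math.sqrt(var_a * var_b))
    return mean_term + var_term


def select_cohort(target, stats, size):
    """Return the `size` speakers whose statistics lie closest to `target`.

    `stats` maps a speaker id to (mean, variance) of that speaker's
    enrolment log-likelihoods. Picking the nearest speakers is an
    assumption made for illustration.
    """
    mean_t, var_t = stats[target]
    distances = {
        spk: bhattacharyya_gaussian(mean_t, var_t, mean_s, var_s)
        for spk, (mean_s, var_s) in stats.items()
        if spk != target
    }
    return sorted(distances, key=distances.get)[:size]


# Hypothetical per-speaker (mean, variance) values, invented for the example.
example_stats = {
    "a": (-50.0, 4.0),
    "b": (-52.0, 4.0),
    "c": (-70.0, 4.0),
    "d": (-50.5, 4.0),
}
cohort_for_a = select_cohort("a", example_stats, size=2)
```

With these invented numbers, speaker "d" is nearest to "a", followed by "b", while "c" is far enough away to be left out of a two-speaker cohort.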

Keywords

error statistics; grammars; maximum likelihood estimation; speaker recognition; Bhattacharyya distance; LDC YOHO speaker verification database; automatic speaker recognition systems; closed set speaker identification; cohort selection; combination lock phrases; critical error analysis; enrolment Viterbi log-likelihoods; equal error rate; false acceptance level; false rejection; females; global threshold verification; males; optimal cohorts; second order statistics; speaker dependent monophone models; speaker identification error rates; test hypothesis; word grammar effects; Algorithm design and analysis; Databases; Error analysis; Feature extraction; Hidden Markov models; Speaker recognition; Speech recognition; Target recognition; Testing; Viterbi algorithm

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on

Conference_Location

Atlanta, GA

ISSN

1520-6149

Print_ISBN

0-7803-3192-3

Type

conf

DOI

10.1109/ICASSP.1996.540296

Filename

540296