• DocumentCode
    302085
  • Title
    Cohort selection and word grammar effects for speaker recognition
  • Author
    Colombi, J.M. ; Ruck, D.W. ; Anderson, T.R. ; Rogers, S.K. ; Oxley, M.
  • Author_Institution
    Air Force Inst. of Technol., Wright-Patterson AFB, OH, USA
  • Volume
    1
  • fYear
    1996
  • fDate
    7-10 May 1996
  • Firstpage
    85
  • Abstract
    Automatic speaker recognition systems are maturing and databases have been designed to specifically compare algorithms and results to target error rates. The LDC YOHO speaker verification database was designed to test error rates at the 1% false rejection and 0.1% false acceptance level. This work examines the use of speaker-dependent (SD) monophone models to meet these requirements. By representing each speaker with 22 monophones, both closed-set speaker identification and global-threshold verification were performed. Using four combination lock phrases, speaker identification error rates of 0.19% for males and 0.31% for females are obtained. By defining a test hypothesis, a critical error analysis for speaker verification is developed and new results are reported for YOHO. A new Bhattacharyya distance is developed for cohort selection. This method, based on the second order statistics of the enrolment Viterbi log-likelihoods, determines the optimal cohorts and achieves an equal error rate of 0.282%.
  • Keywords
    error statistics; grammars; maximum likelihood estimation; speaker recognition; Bhattacharyya distance; LDC YOHO speaker verification database; automatic speaker recognition systems; closed set speaker identification; cohort selection; combination lock phrases; critical error analysis; enrolment Viterbi log-likelihoods; equal error rate; false acceptance level; false rejection; females; global threshold verification; males; optimal cohorts; second order statistics; speaker dependent monophone models; speaker identification error rates; test hypothesis; word grammar effects; Algorithm design and analysis; Databases; Error analysis; Feature extraction; Hidden Markov models; Speaker recognition; Speech recognition; Target recognition; Testing; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
  • Conference_Location
    Atlanta, GA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-3192-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1996.540296
  • Filename
    540296
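
The abstract describes cohort selection by a Bhattacharyya distance computed from second-order statistics (mean and variance) of enrolment Viterbi log-likelihoods, but this record does not give the paper's exact formulation. The sketch below is only an illustration under stated assumptions: it uses the standard closed-form Bhattacharyya distance between two univariate Gaussians, and it assumes that a cohort consists of the candidates whose score distributions lie closest to the target's. The function names, the score lists, and the "closest first" ranking rule are all hypothetical and are not taken from the paper.

```python
import math
from statistics import fmean, pvariance


def gaussian_stats(scores):
    """Return (mean, variance) of a list of log-likelihood scores."""
    return fmean(scores), pvariance(scores)


def bhattacharyya_distance(mean_a, var_a, mean_b, var_b):
    """Standard Bhattacharyya distance between two univariate Gaussians.

    Both variances must be strictly positive.
    """
    var_sum = var_a + var_b
    # Term driven by the separation of the two means.
    mean_term = 0.25 * (mean_a - mean_b) ** 2 / var_sum
    # Term driven by the mismatch between the two variances.
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def select_cohort(target_scores, candidate_scores, size):
    """Rank candidates by distance to the target and keep the closest ones.

    Assumption (not from the paper): the closest candidates form the cohort.
    """
    t_mean, t_var = gaussian_stats(target_scores)

    def distance(name):
        c_mean, c_var = gaussian_stats(candidate_scores[name])
        return bhattacharyya_distance(t_mean, t_var, c_mean, c_var)

    return sorted(candidate_scores, key=distance)[:size]


if __name__ == "__main__":
    # Made-up enrolment log-likelihoods, for illustration only.
    target = [-50.0, -52.0, -48.0]
    candidates = {
        "spk_a": [-51.0, -49.0, -53.0],
        "spk_b": [-80.0, -82.0, -78.0],
        "spk_c": [-55.0, -57.0, -53.0],
    }
    print(select_cohort(target, candidates, size=2))
```

With identical statistics the distance is zero, and it grows as either the means or the variances diverge, which is why it can serve as a ranking criterion for candidate cohort speakers.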