Very large population text-independent speaker identification using transformation enhanced multi-grained models

Author

Chaudhari, Upendra V. ; Navratil, J. ; Ramaswamy, Ganesh N. ; Maes, Stéphane H.

Author_Institution

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume

1

fYear

2001

fDate

2001

Firstpage

461

Abstract

Presents results on speaker identification with a population size of over 10,000 speakers. Speaker modeling is accomplished via our transformation enhanced multi-grained models. We pursue two goals. The first is to study the performance of a number of different systems within the modeling framework of multi-grained models. The second is to analyze performance as a function of population size. We show that the most complex models within the framework perform best and demonstrate that the identification error rate scales approximately linearly with the log of the population size for the described system. Further, based on our analysis of the system performance, we develop a candidate rejection technique that indicates low confidence in the chosen identity.

Keywords

Gaussian distribution; feature extraction; hidden Markov models; speaker recognition; candidate rejection technique; identification error rate; population size; speaker modeling; telephone-quality speech; transformation enhanced Gaussian mixture model; transformation enhanced multi-grained models; very large population text-independent speaker identification; Cepstral analysis; Error analysis; Mel frequency cepstral coefficient; Microphones; Performance analysis; Speaker recognition; Speech; System performance; Testing; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on

Conference_Location

Salt Lake City, UT

ISSN

1520-6149

Print_ISBN

0-7803-7041-4

Type

conf

DOI

10.1109/ICASSP.2001.940867

Filename

940867