• DocumentCode
    1239840
  • Title
    Speaker classification using composite hypothesis testing and list decoding
  • Author
    Roberts, William J.J. ; Ephraim, Yariv ; Sabrin, Howard W.
  • Author_Institution
    Atlantic Coast Technol. Inc., Silver Spring, MD, USA
  • Volume
    13
  • Issue
    2
  • fYear
    2005
  • fDate
    3/1/2005 12:00:00 AM
  • Firstpage
    211
  • Lastpage
    219
  • Abstract
    Speaker classification is formulated as a hypothesis testing problem with J simple hypotheses and one composite hypothesis. The simple hypotheses represent target speakers, while the composite hypothesis represents nontarget speakers. The simple hypotheses have well-defined distributions that are estimated from training signals. The distribution of the signal under the composite hypothesis is assumed to belong to a given family, and the parameter of that distribution is assumed to be random with a prior distribution that is estimated from a large set of speakers. This formulation converts the problem into one of testing J+1 simple hypotheses. Signals corresponding to target and nontarget speakers are assumed to be Gaussian mixture processes. Once the system has been trained, list decoding is applied, in which a test signal is associated with a list of possible speakers. The probability that the correct speaker is on the list is maximized for a given average number of incorrect speakers on the list. Results from speaker identification and speaker verification experiments are reported. In speaker identification using a National Institute of Standards and Technology (NIST) database with 174 target speakers, over 77% correct identification was achieved for an average of fewer than two erroneous speakers on the list. Speaker verification experiments on a similar database yielded equal error rates of 6.7% and 10.1% using two decision rules.
  • Keywords
    Bayes methods; Gaussian processes; database management systems; decoding; error statistics; maximum likelihood estimation; speaker recognition; Gaussian mixtures process; J simple hypotheses; prior distribution; composite hypothesis testing; hypothesis testing; equal-error-rate; list decoding; nontarget speaker; speaker classification; speaker verification; Databases; Hidden Markov models; NIST; Parameter estimation; Random variables; Signal processing; System testing; Composite hypothesis
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/TSA.2004.838536
  • Filename
    1395966
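
The list-decoding step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the decision rule itself, so the code below assumes a likelihood-ratio rule of the kind commonly used in list decoding: hypothesis j is placed on the list when its likelihood is at least a threshold times the summed likelihood of all competing hypotheses. The function name `list_decode`, the `threshold` parameter, and the equal-prior assumption are all hypothetical and are not taken from the paper.

```python
import math


def list_decode(log_likes, threshold):
    """Return the indices of the hypotheses placed on the list.

    log_likes: log-likelihoods log p(y | H_j) of the test signal under
               each of the J+1 simple hypotheses (assumed equal priors).
    threshold: hypothetical tuning parameter; a larger value yields a
               shorter list, trading correct inclusions for fewer
               incorrect speakers on the list.
    """
    # Shift by the maximum before exponentiating to avoid underflow.
    peak = max(log_likes)
    likes = [math.exp(v - peak) for v in log_likes]
    total = sum(likes)
    # Keep hypothesis j when its likelihood is at least `threshold`
    # times the combined likelihood of all other hypotheses.
    return [j for j, lj in enumerate(likes) if lj >= threshold * (total - lj)]


if __name__ == "__main__":
    scores = [-10.0, -1.0, -1.2, -8.0]  # made-up log-likelihoods
    print(list_decode(scores, 0.5))
    print(list_decode(scores, 1.0))
```

Sweeping `threshold` in such a rule would trace out the trade-off the abstract refers to, between the probability that the correct speaker is on the list and the average number of incorrect speakers on it.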