• DocumentCode
    2701292
  • Title
    Word-Conditioned Phone N-Grams for Speaker Recognition
  • Author
    Lei, Haozhen ; Mirghafori, N.
  • Author_Institution
    Int. Comput. Sci. Inst., Berkeley, CA, USA
  • Volume
    4
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Abstract
    We extend the state of the art by applying word-conditioning to constrain the phone N-gram features used in speaker recognition. Feature-level combination of 52 word unigrams constraining phone N-grams of order 1, 2, and 3 proved to be the best approach. Our system achieves improvements of 18% and 27% over a non-word-conditioned phone N-gram system on SRE05 and SRE06, respectively. Furthermore, when each system is combined with a GMM-based system, the word-conditioned system achieves improvements of 18% and 37% over the non-word-conditioned phone N-gram system on SRE05 and SRE06, respectively, suggesting that the word-conditioned features are more complementary. On both corpora, this approach achieves a 4.7% EER standalone, and a 3.3% EER in combination with the non-word-conditioned phone N-gram and GMM-based systems. Note that the word-conditioning approach utilizes only 43% of SRE05 data.
  • Keywords
    Gaussian processes; speaker recognition; speech processing; GMM-based system; speaker recognition; word unigrams; word-conditioned phone N-grams; word-conditioning approach; Cepstral analysis; Computer science; Detectors; Feature extraction; Hidden Markov models; Humans; Loudspeakers; Speaker recognition; Speech recognition; Testing; Speaker-recognition; high-level features; phone N-grams; word-conditioning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
  • Conference_Location
    Honolulu, HI
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0727-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2007.366897
  • Filename
    4218085
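
The feature described in the abstract (phone N-grams of order 1, 2, and 3, kept only where they fall inside occurrences of a chosen word list, then combined at the feature level) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the input format (a list of word and phone-sequence pairs, as if from a time alignment), the function names `word_conditioned_ngrams` and `relative_frequencies`, the toy word list, and the use of relative frequencies keyed by (word, N-gram) as the combined feature vector.

```python
from collections import Counter


def word_conditioned_ngrams(aligned, word_list, max_order=3):
    """Count phone N-grams (orders 1..max_order) inside selected words only.

    `aligned` is a hypothetical list of (word, phones) pairs, where `phones`
    is the phone sequence recognized within that word occurrence.
    """
    counts = Counter()
    for word, phones in aligned:
        if word not in word_list:
            continue  # phones outside the conditioning words are discarded
        for n in range(1, max_order + 1):
            for i in range(len(phones) - n + 1):
                # Keying by (word, N-gram) keeps each word's N-grams separate,
                # which acts as a feature-level combination across words.
                counts[(word, tuple(phones[i:i + n]))] += 1
    return counts


def relative_frequencies(counts):
    """Normalize counts into relative frequencies (assumed feature scaling)."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()} if total else {}


# Toy data, invented for this sketch.
aligned = [
    ("yeah", ["y", "ae"]),
    ("the", ["dh", "ah"]),
    ("yeah", ["y", "eh", "ah"]),
]
word_list = {"yeah", "right"}  # the paper uses 52 word unigrams

counts = word_conditioned_ngrams(aligned, word_list)
features = relative_frequencies(counts)
```

In this sketch the word "the" contributes nothing because it is not in the word list, which mirrors the abstract's remark that word-conditioning uses only part of the data. How the resulting vectors are modeled and scored is left out, since the abstract does not describe it.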