• DocumentCode
    454634
  • Title

    Quantization for Adapted GMM-Based Speaker Verification

  • Author

    Tseng, Ivy H. ; Verscheure, Olivier ; Turaga, Deepak S. ; Chaudhari, Upendra V.

  • Author_Institution
    Inst. of Signal & Image Process., Southern California Univ., Los Angeles, CA
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    State-of-the-art speaker verification systems are built around the likelihood ratio test, using Gaussian mixture models (GMM) for likelihood functions, a universal background model (UBM) for alternative speaker representation, and a form of Bayesian adaptation to derive speaker models from the UBM. This work tackles optimal quantizer design for the speech cepstral features (MFCCs) in such systems. The problem is posed as minimizing the loss in log-likelihood ratio between the quantized and unquantized speech features. First, we show that the conventional mean squared error (MSE) quantizer for the top-scoring UBM Gaussian is optimal under practical assumptions. Then, we derive the optimal bit allocation strategy across the dimensions of the feature vectors. Finally, we demonstrate the validity of the approach against various quantization and bit allocation schemes by running experiments on an appropriately modified IBM speaker verification system. Experimental results on the HUB4 corpora show negligible impact on verification performance at average bit rates below 1 bit per dimension, in contrast to 32 bits per dimension in the original system.
  • Keywords
    Bayes methods; Gaussian processes; mean square error methods; speaker recognition; Bayesian adaptation; GMM; Gaussian mixture models; MSE; mean squared error; optimal bit allocation; speaker representation; speaker verification; speech cepstral features; universal background model; unquantized speech features; Bit rate; Cepstral analysis; Data mining; Distributed processing; Feature extraction; Mel frequency cepstral coefficient; Quantization; Signal processing; Speech recognition; Telecommunication standards;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type

    conf

  • DOI
    10.1109/ICASSP.2006.1660105
  • Filename
    1660105
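
The objective named in the abstract, the loss in log-likelihood ratio caused by quantizing the features, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two-component diagonal-covariance models, the toy frames, the function names, and the plain uniform scalar quantizer, which stands in for (and is not) the MSE quantizer or the bit allocation derived in the paper.

```python
import math

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM."""
    terms = []
    for w, mu, var in zip(weights, means, variances):
        log_gauss = sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mu, var)
        )
        terms.append(math.log(w) + log_gauss)
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def avg_llr(frames, speaker, ubm):
    """Average per-frame log-likelihood ratio: speaker model vs. UBM."""
    total = sum(gmm_loglik(x, *speaker) - gmm_loglik(x, *ubm) for x in frames)
    return total / len(frames)

def uniform_quantize(x, bits, lo, hi):
    """Uniform scalar quantizer: bits[d] bits for dimension d over [lo[d], hi[d]]."""
    out = []
    for xi, b, a, z in zip(x, bits, lo, hi):
        levels = 2 ** b
        step = (z - a) / levels
        # Cell index, clamped so out-of-range values map to the edge cells.
        idx = min(max(math.floor((xi - a) / step), 0), levels - 1)
        out.append(a + (idx + 0.5) * step)  # reconstruct at the cell midpoint
    return out

def llr_loss(frames, speaker, ubm, bits, lo, hi):
    """Absolute change in the average LLR when the frames are quantized."""
    quantized = [uniform_quantize(x, bits, lo, hi) for x in frames]
    return abs(avg_llr(frames, speaker, ubm) - avg_llr(quantized, speaker, ubm))

# Hypothetical toy models: (weights, means, variances). The speaker model
# differs from the UBM only in its means, loosely mimicking mean adaptation.
ubm = ([0.5, 0.5], [[-1.0, 0.0], [1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]])
speaker = ([0.5, 0.5], [[-0.5, 0.3], [1.4, -0.2]], [[1.0, 1.0], [1.0, 1.0]])
frames = [[-0.7, 0.2], [1.2, -0.1], [0.3, 0.4]]

for b in (1, 4, 8):
    print(b, llr_loss(frames, speaker, ubm, [b, b], [-4.0, -4.0], [4.0, 4.0]))
```

The `bits` argument takes one value per feature dimension, so an uneven allocation across dimensions can be tried by passing, for example, `[2, 0]` instead of `[1, 1]`.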