• DocumentCode
    2696634
  • Title

    Improvements in MLLR-Transform-based Speaker Recognition

  • Author

    Stolcke, Andreas ; Ferrer, Luciana ; Kajarekar, Sachin

  • Author_Institution
    Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA
  • fYear
    2006
  • fDate
    28-30 June 2006
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    We previously proposed the use of MLLR transforms derived from a speech recognition system as speaker features in a speaker verification system. In this paper we report recent improvements to this approach. First, we noticed a fundamental problem in our previous implementation that stemmed from a mismatch between male and female recognition models, and the model transforms they produce. Although it affects only a small percentage of verification trials (those in which the gender detector commits errors), this mismatch has a large effect on average system performance. We solve this problem by consistently using only one recognition model (either male or female) in computing speaker adaptation transforms regardless of estimated speaker gender. A further accuracy boost is obtained by combining feature vectors derived from male and female vectors into one larger feature vector. Using 1-conversation-side training, the final system has about 27% lower decision cost than a state-of-the-art cepstral GMM speaker system, and 53% lower decision cost when trained on 8 conversation sides per speaker.
  • Keywords
    Gaussian processes; maximum likelihood estimation; regression analysis; speaker recognition; training; transforms; 1-conversation-side training; MLLR transform; average system performance; speaker verification system; speech recognition system; state-of-art cepstral GMM speaker system; Cepstral analysis; Costs; Lattices; Maximum likelihood decoding; Maximum likelihood linear regression; Mel frequency cepstral coefficient; NIST; Predictive models; Speaker recognition; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Speaker and Language Recognition Workshop, 2006. IEEE Odyssey 2006: The
  • Conference_Location
    San Juan
  • Print_ISBN
    1-4244-0471-1
  • Electronic_ISBN
    1-4244-0472-X
  • Type

    conf

  • DOI
    10.1109/ODYSSEY.2006.248089
  • Filename
    4013506