• DocumentCode
    2134950
  • Title
    Duration weighted Gaussian Mixture Model supervector modeling for robust speaker recognition
  • Author
    Zhe Ji ; Wei Hou ; Xin Jin ; Zhi-Yi Li
  • Author_Institution
    Telecom Network Security Div., CNCERT/CC, Beijing, China
  • fYear
    2013
  • fDate
    23-25 July 2013
  • Firstpage
    238
  • Lastpage
    241
  • Abstract
    This paper proposes a duration weighted Gaussian Mixture Model (GMM) supervector modeling method that aims to make supervector modeling of speech utterances more effective and accurate for robust speaker recognition. The method first adapts a GMM to the acoustic features of each speech utterance, starting from a common Universal Background Model (UBM) and using the Maximum A Posteriori (MAP) criterion, and then forms the GMM supervector by bounding the Kullback-Leibler (KL) divergence measure. In addition, a duration weight supervector is modeled to exploit the duration information of speech utterances. The paper also presents a method for applying the two supervectors together during training and classification. Experimental results on the American National Institute of Standards and Technology Speaker Recognition Evaluation (NIST SRE) 2008 dataset show that the proposed method outperforms traditional GMM supervector modeling, with relative improvements of 16% in Equal Error Rate (EER) and 10% in Minimum Detection Cost Function (MinDCF).
  • Keywords
    Gaussian processes; maximum likelihood estimation; mixture models; speaker recognition; American national institute of standards and technology speaker recognition evaluation; GMM; KL; Kullback-Leibler divergence measure; MAP; MinDCF; NIST SRE; UBM; duration weighted Gaussian mixture model supervector modeling; equal error rate; maximum a posteriori criterion; minimum detection cost function; robust speaker recognition; speech utterance; universal background model; Acoustics; Adaptation models; Robustness; Speaker recognition; Speech; Support vector machines; Training; duration weighted; robust speaker recognition; supervector modeling; support vector machine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Computation (ICNC), 2013 Ninth International Conference on
  • Conference_Location
    Shenyang
  • Type
    conf
  • DOI
    10.1109/ICNC.2013.6817977
  • Filename
    6817977
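
  • Illustrative sketch (not part of the original record)
    The baseline pipeline named in the abstract (MAP adaptation from a UBM, followed by a KL-divergence-bounded GMM supervector) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a diagonal-covariance UBM, mean-only MAP adaptation with a relevance factor of 16, and the commonly used scaling sqrt(w_i) * m_i / sigma_i for the supervector. The function names map_adapt_means and kl_supervector are hypothetical. The duration weight supervector is not sketched because the abstract does not define it.

```python
import numpy as np


def map_adapt_means(features, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance UBM (assumed setup).

    features:  (T, D) acoustic feature frames of one utterance
    weights:   (C,)   UBM mixture weights
    means:     (C, D) UBM means
    variances: (C, D) UBM diagonal variances
    relevance: relevance factor (16 is an assumption, not from the paper)
    """
    # Log-density of every frame under every diagonal Gaussian: shape (T, C).
    diff = features[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )

    # Per-frame mixture posteriors, computed stably in the log domain.
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)                              # (C,)
    ex = (post.T @ features) / np.maximum(n, 1e-10)[:, None]

    # Interpolate between the data mean and the UBM mean per mixture.
    alpha = (n / (n + relevance))[:, None]
    return alpha * ex + (1.0 - alpha) * means


def kl_supervector(adapted_means, weights, variances):
    """Stack scaled means so that Euclidean distance between supervectors
    corresponds to the usual upper bound on the KL divergence."""
    scaled = np.sqrt(weights)[:, None] * adapted_means / np.sqrt(variances)
    return scaled.ravel()
```

    With few frames the adapted means stay close to the UBM means, and with many frames they move toward the utterance statistics, which is one reason utterance duration affects how reliable a supervector is.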