• DocumentCode
    1290970
  • Title
    Front-End Factor Analysis for Speaker Verification
  • Author
    Dehak, Najim ; Kenny, Patrick ; Dehak, Réda ; Dumouchel, Pierre ; Ouellet, Pierre
  • Author_Institution
    Comput. Sci. & Artificial Intell. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
  • Volume
    19
  • Issue
    4
  • fYear
    2011
  • fDate
    5/1/2011
  • Firstpage
    788
  • Lastpage
    798
  • Abstract
    This paper extends our previous work, which proposed a new speaker representation for speaker verification. In this modeling, a new low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis. This space is named the total variability space because it models both speaker and channel variabilities. Two speaker verification systems are proposed which use this new representation. The first system is a support vector machine-based system that uses the cosine kernel to estimate the similarity between the input data. The second system directly uses the cosine similarity as the final decision score. We tested three channel compensation techniques in the total variability space: within-class covariance normalization (WCCN), linear discriminant analysis (LDA), and nuisance attribute projection (NAP). We found that the best results are obtained when LDA is followed by WCCN. We achieved an equal error rate (EER) of 1.12% and a MinDCF of 0.0094 using cosine distance scoring on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation dataset. We also obtained a 4% absolute EER improvement for both-gender trials on the 10 s-10 s condition compared to the classical joint factor analysis scoring.
  • Keywords
    speaker recognition; support vector machines; channel compensation technique; channel dependent space; cosine kernel; decision score; front end factor analysis; linear discriminant analysis; low dimensional speaker; nuisance attribute projection; similarity estimation; speaker recognition evaluation; speaker representation; speaker verification; support vector machine; variability space; within class covariance normalization; Cosine distance scoring; joint factor analysis (JFA); support vector machines (SVMs); total variability space
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2010.2064307
  • Filename
    5545402
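
The abstract describes a second system that uses the cosine similarity between two total-variability vectors directly as the decision score. A minimal sketch of that idea follows, in plain Python. The function names (`apply_projection`, `cosine_score`, `accept`), the `projection` argument, and the `threshold` parameter are hypothetical names chosen for illustration and are not taken from the paper. The projection stands in for a channel-compensation matrix (for example, LDA followed by WCCN) that is assumed to have been estimated elsewhere.

```python
import math


def apply_projection(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def cosine_score(w_target, w_test, projection=None):
    """Cosine similarity between a target vector and a test vector.

    `projection` is an assumed, pre-estimated channel-compensation matrix;
    when given, it is applied to both vectors before scoring.
    """
    if projection is not None:
        w_target = apply_projection(projection, w_target)
        w_test = apply_projection(projection, w_test)
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm = math.sqrt(sum(a * a for a in w_target)) * math.sqrt(
        sum(b * b for b in w_test)
    )
    if norm == 0.0:
        raise ValueError("cannot score a zero-norm vector")
    return dot / norm


def accept(w_target, w_test, threshold, projection=None):
    """Accept the trial when the cosine score reaches the threshold."""
    return cosine_score(w_target, w_test, projection) >= threshold
```

Because the score depends only on the angle between the two vectors, scaling either vector leaves the decision unchanged; the threshold would be tuned on held-out trials.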