• DocumentCode
    2800228
  • Title
    Across-phone variability and diagonal term in joint factor analysis for speaker recognition
  • Author
    Kajarekar, Sachin S.
  • Author_Institution
    SRI Int., Menlo Park, CA, USA
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4406
  • Lastpage
    4409
  • Abstract
    We investigate the usefulness of across-phone variability for speaker recognition in a joint factor analysis (JFA) framework. We estimate the variability as the across-phone covariance within a conversation side, averaged over all conversations. Note that it is part of the channel variability in the current JFA framework. We independently estimate feature subspaces representing across-phone, speaker, and channel variability and perform speaker recognition experiments by either keeping them or removing them. The results show that the across-phone subspace is more correlated with the speaker subspace. We also perform speaker recognition experiments when combining the subspaces. Results show an improvement when phone and speaker subspaces are combined. This shows that across-phone variability is useful for speaker recognition. Further experiments show that the results are affected by a diagonal term from JFA. In particular, the improvement when combining the speaker and phone subspaces is reduced when the diagonal term is estimated from a universal background model (UBM). This implies that there is an interaction between the variability represented by the diagonal term and the across-phone variability. Overall, the work shows the importance of understanding the diagonal term (with speaker and channel subspaces) for incorporating additional variability into JFA beyond speaker and channel.
  • Keywords
    speaker recognition; across-phone covariance; across-phone variability; channel variability; conversation side; diagonal term; joint factor analysis; phone subspaces; speaker subspaces; universal background model; Cepstral analysis; Error analysis; Natural languages; Polynomials; Speech analysis; Speech recognition; Stacking; Support vector machines; Testing; language independent speech recognition; phonetic variability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495630
  • Filename
    5495630