• DocumentCode
    45770
  • Title

    An Investigation into Back-end Advancements for Speaker Recognition in Multi-Session and Noisy Enrollment Scenarios

  • Author

    Liu, Gang; Hansen, John H. L.

  • Author_Institution
    Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA
  • Volume
    22
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    1978
  • Lastpage
    1992
  • Abstract
    This study explores robust speaker recognition with noisy, multi-session enrollments, with an emphasis on optimal organization and utilization of the speaker information present in the enrollment and development data. The study has two core objectives. First, we investigate more robust back-ends to address noisy multi-session enrollment data for speaker recognition, which we achieve by proposing novel back-end algorithms. Second, we construct a highly discriminative speaker verification framework, which we achieve through intrinsic and extrinsic back-end algorithm modification, resulting in complementary sub-systems. The proposed framework is evaluated on the NIST SRE2012 corpus. The results not only confirm individual sub-system advancements over an established baseline; the final grand fusion solution also represents a comprehensive overall advancement for the NIST SRE2012 core tasks. Compared with state-of-the-art SID systems on NIST SRE2012, the novel parts of this study are: 1) exploring a more diverse set of solutions for low-dimensional i-Vector based modeling; and 2) diversifying the information configuration before modeling. These two parts work together, resulting in very competitive performance at reasonable computational cost.
  • Keywords
    acoustic noise; speaker recognition; NIST SRE2012 corpus; SID systems; back-end advancements; discriminative speaker verification framework; extrinsic back-end algorithm; grand fusion solution; information configuration diversifying; intrinsic algorithm modification; low-dimensional i-Vector based modeling; multisession enrollments scenarios; noisy enrollment scenario; robust speaker recognition; Computational modeling; Covariance matrices; Noise measurement; Speaker recognition; Speech; Speech processing; Support vector machines; Classification algorithms; GCDS; PLDA; speaker recognition; universal background support
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type

    jour

  • DOI
    10.1109/TASLP.2014.2352154
  • Filename
    6883142