• DocumentCode
    737534
  • Title
    Speaker Verification by Vowel and Nonvowel Like Segmentation
  • Author
    Pradhan, G. ; Prasanna, S.R.M.
  • Author_Institution
    Dept. of Electron. & Electr. Eng., Indian Inst. of Technol. Guwahati, Guwahati, India
  • Volume
    21
  • Issue
    4
  • fYear
    2013
  • fDate
    4/1/2013 12:00:00 AM
  • Firstpage
    854
  • Lastpage
    867
  • Abstract
    This work proposes methods for detecting vowel-like regions (VLRs) and non-vowel-like regions (non-VLRs) using excitation source information. The VLR onset and end points are hypothesized and used in an iterative algorithm for detecting the VLRs. Next, for detection of non-VLRs, the linear prediction (LP) residual samples in the VLRs are attenuated significantly to indirectly emphasize the residual samples in the non-VLRs. The modified LP residual samples then excite the time-varying all-pole filter to reconstruct speech in which the non-VLRs are enhanced, and this speech is used for detecting the non-VLRs. The VLRs and non-VLRs are used independently during training and testing of a speaker verification (SV) system, both to reduce gross-level mismatch due to sound units and to achieve better compensation of degradation effects by applying different normalization to these two different energy regions. Finally, the scores are combined with a higher weight on the VLRs, which are more speaker specific. Experiments verify that the proposed approach provides improved performance for clean and degraded speech. On the NIST-2003 speaker recognition database, using VLRs and non-VLRs improves the equal error rate from 6.63% to 6% and from 2.29% to 1.89% for a GMM-UBM-based and an i-vector-based SV system, respectively.
  • Keywords
    iterative methods; speaker recognition; time-varying filters; GMM-UBM; NIST-2003 speaker recognition database; VLR onset; end points; equal error rate; excitation source information; i-vector based SV system; iterative algorithm; linear prediction residual samples; modified LP residual samples; non-VLR; nonvowel like segmentation; nonvowel-like regions; speaker verification system; time varying all pole filter; vowel like segmentation; vowel-like regions detection; Degradation; Helium; Noise; Robustness; Speech; Testing; Training; VLREP; VLROP; VLRs; degraded condition; non-VLRs; speaker information; speaker verification;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2013.2238529
  • Filename
    6407843
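
The abstract names three signal-level steps: attenuating the LP residual inside the VLRs, re-exciting a time-varying all-pole filter with the modified residual, and combining the two region scores with a higher weight on the VLRs. The following is a minimal sketch of those steps, not the authors' implementation. The function names, the attenuation gain, the fusion weight, the frame-wise coefficient switching, and the linear form of the score combination are all illustrative assumptions; the abstract gives none of these details. The sketch assumes the convention A(z) = 1 + sum_k a_k z^-k, so synthesis is s[n] = e[n] - sum_k a_k s[n-k].

```python
def attenuate_vlr_residual(residual, vlr_regions, gain=0.01):
    """Scale LP residual samples inside each VLR by `gain`.

    `vlr_regions` is a list of (start, end) sample indices, end exclusive.
    The default gain is a hypothetical value, not taken from the paper.
    """
    out = list(residual)
    for start, end in vlr_regions:
        for n in range(max(start, 0), min(end, len(out))):
            out[n] *= gain
    return out


def all_pole_synthesis(excitation, frame_coeffs, frame_len):
    """Excite a time-varying all-pole filter with `excitation`.

    `frame_coeffs[i]` holds the LP coefficients [a_1, ..., a_p] used for
    frame i (assumed frame-wise switching); the last set is reused if the
    signal is longer than the frames provided.
    """
    speech = []
    for n, e in enumerate(excitation):
        coeffs = frame_coeffs[min(n // frame_len, len(frame_coeffs) - 1)]
        sample = e
        for k, a_k in enumerate(coeffs, start=1):
            if n - k >= 0:
                sample -= a_k * speech[n - k]
        speech.append(sample)
    return speech


def fuse_scores(vlr_score, non_vlr_score, vlr_weight=0.7):
    """Linear score fusion with more weight on the VLR score (assumed form)."""
    if not 0.5 < vlr_weight <= 1.0:
        raise ValueError("vlr_weight must be in (0.5, 1.0]")
    return vlr_weight * vlr_score + (1.0 - vlr_weight) * non_vlr_score
```

In this sketch, the output of `attenuate_vlr_residual` would be passed to `all_pole_synthesis` to obtain speech in which the non-VLRs are relatively emphasized, and `fuse_scores` would be applied to the per-trial scores of the two region-specific verification systems.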