• DocumentCode
    1689210
  • Title
    Robust front-end processing for speaker identification over extremely degraded communication channels
  • Author
    Sadjadi, Seyed Omid ; Hansen, John H. L.

  • Author_Institution
    Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA
  • fYear
    2013
  • Firstpage
    7214
  • Lastpage
    7218
  • Abstract
    Effective front-end processing, which often involves feature extraction and speech activity detection (SAD), is essential for robustness in speech systems. In this study, we propose an unsupervised SAD scheme based on four different speech voicing measures which are combined with a perceptual spectral flux feature. Effectiveness of the proposed scheme is evaluated and compared against several commonly adopted unsupervised SAD methods under actual harsh acoustic conditions. As an example application, we also evaluate performance of the proposed SAD in the context of an i-vector based speaker identification (SID) system, where the recently introduced mean Hilbert envelope coefficients (MHEC) are benchmarked against conventional MFCCs. Long and spontaneous conversational audio recordings from DARPA program RATS (Phase-I) are used in our evaluations. Experimental results indicate that the proposed SAD solution is highly effective and provides superior performance compared to other unsupervised SAD techniques considered. In addition, it is shown that MHECs are effective alternatives to MFCCs for SID tasks under severely degraded channel conditions.
  • Keywords
    feature extraction; speaker recognition; speech processing; DARPA program RATS; MFCC; MHEC; SAD solution; SID system; SID tasks; channel conditions; communication channels; feature extraction; harsh acoustic conditions; i-vector based speaker identification system; mean Hilbert envelope coefficients; perceptual spectral flux feature; robust front end processing; speech activity detection; speech systems; speech voicing measures; spontaneous conversational audio recordings; unsupervised SAD methods; unsupervised SAD scheme; unsupervised SAD techniques; Feature extraction; Mel frequency cepstral coefficient; Noise; Noise measurement; Rats; Robustness; Speech; Mean Hilbert Envelope Coefficients (MHEC); speaker identification (SID); spectral flux; speech activity detection (SAD); voicing measures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639063
  • Filename
    6639063