• DocumentCode
    3429131
  • Title
    Speech-codebook based soft Voice Activity Detection
  • Author
    Heese, Florian; Niermann, Markus; Vary, Peter
  • Author_Institution
    Inst. of Commun. Syst. & Data Process., RWTH Aachen Univ., Aachen, Germany
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4335
  • Lastpage
    4339
  • Abstract
    A novel noise-robust soft Voice Activity Detector (VAD) operating in the short-time Fourier domain is presented. A speech energy gain is obtained by frame-wise processing of a noisy speech signal with a speech codebook algorithm, and this gain is used for robust voice detection. A speaker-independent speech codebook consisting of spectral envelopes is created in a training process. When the algorithm is applied, the codebook is adapted to the current speaker in every frame by combining the harmonic pitch structure of the current noisy speech frame with the codebook entries. Soft VAD values ranging from zero to one are calculated by post-processing the speech gain, which is obtained using gain-shape vector quantization. A binary VAD decision is obtained by applying a threshold. The proposed method does not rely on a priori knowledge of the noise and is robust with respect to highly non-stationary noise and adverse SNR conditions. In addition, the detection rate can be traded off against the false-alarm rate by varying a threshold without increasing the total number of misdetections. Compared to state-of-the-art VAD systems, the proposed method achieves better detection rates at significantly lower false-alarm rates.
  • Keywords
    speech coding; vector quantisation; actual noisy speech frame; gain shape vector quantization; harmonic pitch structure; noisy speech signal; short-time Fourier domain; spectral envelopes; speech energy gain; speech-codebook based soft voice activity detection; Accuracy; Gain; Noise measurement; Signal to noise ratio; Speech; Speech coding; Codebook; Noise robust; Voice activity detection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178789
  • Filename
    7178789
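
The detection chain outlined in the abstract (gain-shape vector quantization against a codebook of spectral envelopes, a soft value between zero and one, then a threshold) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the post-processing or the pitch-based codebook adaptation, so the energy-ratio mapping in `soft_vad`, the default threshold, and all function names are hypothetical.

```python
import math


def normalize(shape):
    """Scale a spectral-envelope vector to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in shape))
    return [v / norm for v in shape]


def gain_shape_vq(frame, codebook):
    """Return (index, gain) of the unit-norm shape that best fits `frame`.

    For a unit-norm shape c the least-squares gain is g = <frame, c>, and the
    residual energy is ||frame||^2 - g^2. With g restricted to be
    non-negative, the best entry is the one with the largest inner product.
    """
    gains = [sum(x * c for x, c in zip(frame, shape)) for shape in codebook]
    index = max(range(len(gains)), key=gains.__getitem__)
    return index, max(gains[index], 0.0)


def soft_vad(frame, codebook):
    """Map the speech gain to [0, 1].

    ASSUMPTION: the soft value is the fraction of frame energy explained by
    the best codebook entry. The paper's actual post-processing is not given
    in the abstract.
    """
    _, gain = gain_shape_vq(frame, codebook)
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    # Cauchy-Schwarz bounds gain^2 by the energy; min() guards rounding.
    return min(1.0, gain * gain / energy)


def binary_vad(soft_value, threshold=0.5):
    """Hard decision; the default threshold is an arbitrary placeholder."""
    return soft_value >= threshold


# Toy codebook of two envelope shapes (made-up values for illustration).
codebook = [normalize([1.0, 2.0, 1.0, 0.0]), normalize([0.0, 1.0, 2.0, 1.0])]
speech_like = [2.0, 4.0, 2.0, 0.0]    # proportional to the first shape
noise_like = [1.0, -1.0, 1.0, -1.0]   # orthogonal to both shapes
print(soft_vad(speech_like, codebook), soft_vad(noise_like, codebook))
```

In this sketch, lowering `threshold` makes the binary decision more permissive, which raises the detection rate and the false-alarm rate together; the abstract's claim that the trade-off does not increase the total number of misdetections is a property of the published method, not of this toy example.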