• DocumentCode
    867591
  • Title

    A new Kullback-Leibler VAD for speech recognition in noise

  • Author

    Ramírez, Javier ; Segura, José C. ; Benítez, Carmen ; de la Torre, A. ; Rubio, Antonio J.

  • Author_Institution
    Dept. de Electrónica y Tecnología de Computadores, Univ. de Granada, Spain
  • Volume
    11
  • Issue
    2
  • fYear
    2004
  • Firstpage
    266
  • Lastpage
    269
  • Abstract
    This letter presents an innovative voice activity detector (VAD) based on the Kullback-Leibler (KL) divergence measure. The algorithm is evaluated in the context of the recently approved ETSI standard for distributed speech recognition (DSR). The VAD uses long-term information from the noisy speech signal to define a more robust decision rule that yields high accuracy. The mel-scaled filter bank log-energies (FBE) are modeled by Gaussian distributions, and a symmetric KL divergence is used to estimate the distance between the speech and noise distributions. The decision rule is formulated in terms of the average subband KL divergence, which is compared to a noise-adaptable threshold. An exhaustive analysis using the AURORA databases is conducted to assess the performance of the proposed method and to compare it with existing standard VAD methods.
  • Keywords
    Gaussian distribution; Gaussian noise; acoustic noise; channel bank filters; speech recognition; AURORA database; ETSI standard; Kullback-Leibler divergence measure; average subband divergence; distributed speech recognition; innovative voice activity detector; long-term information; mel-scaled filter bank log-energy; noise reduction; noise-adaptable threshold; noisy speech recognition; noisy speech signal; robust decision rule; speech-noise distance distribution; standard voice detector method; symmetric Kullback-Leibler divergence estimation; voice activity detector; Databases; Detectors; Filter bank; Noise robustness; Performance analysis; Speech enhancement; Telecommunication standards
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type

    jour

  • DOI
    10.1109/LSP.2003.821762
  • Filename
    1261996
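
The decision rule outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the window length nor the threshold adaptation, so the function names, the variance floor, the fixed `threshold` argument, and the way the window statistics are estimated are all assumptions made for illustration. The only standard result used is the closed form of the symmetric KL divergence between two univariate Gaussians.

```python
import numpy as np

VAR_FLOOR = 1e-6  # assumed floor that keeps the variance ratios finite


def symmetric_kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Symmetric KL divergence KL(p||q) + KL(q||p) for univariate Gaussians.

    Inputs may be scalars or arrays holding one value per subband.
    """
    var_p = np.maximum(np.asarray(var_p, dtype=float), VAR_FLOOR)
    var_q = np.maximum(np.asarray(var_q, dtype=float), VAR_FLOOR)
    diff_sq = (np.asarray(mu_p, dtype=float) - np.asarray(mu_q, dtype=float)) ** 2
    return 0.5 * (
        var_p / var_q + var_q / var_p - 2.0
        + diff_sq * (1.0 / var_p + 1.0 / var_q)
    )


def vad_decision(window_fbe, noise_mean, noise_var, threshold):
    """Hypothetical decision rule: average the subband divergence, then threshold.

    window_fbe: array of shape (frames, bands) holding the log-energies of a
                long-term window around the current frame.
    noise_mean, noise_var: per-band noise statistics, shape (bands,).
    threshold: fixed here; the letter uses a noise-adaptable threshold whose
               form is not given in the abstract.
    Returns (is_speech, average_divergence).
    """
    window_fbe = np.asarray(window_fbe, dtype=float)
    win_mean = window_fbe.mean(axis=0)
    win_var = window_fbe.var(axis=0)
    per_band = symmetric_kl_gaussian(win_mean, win_var, noise_mean, noise_var)
    avg_div = float(per_band.mean())
    return avg_div > threshold, avg_div
```

In use, `noise_mean` and `noise_var` would be estimated from frames assumed to contain only noise, and `vad_decision` would be called on a sliding window of filter bank log-energies, one decision per frame.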