• DocumentCode
    3558772
  • Title

    Jointly Gaussian PDF-Based Likelihood Ratio Test for Voice Activity Detection

  • Author

    Górriz, Juan Manuel ; Ramirez, Javier ; Lang, Elmar W. ; Puntonet, Carlos G.

  • Author_Institution
    Dept. of Signal Theor., Univ. of Granada, Granada
  • Volume
    16
  • Issue
    8
  • fYear
    2008
  • Firstpage
    1565
  • Lastpage
    1578
  • Abstract
    This paper presents a novel voice activity detector (VAD) for improving speech detection robustness in noisy environments and the performance of speech recognition systems in real-time applications. The algorithm is based on a generalized complex Gaussian (GCG) observation model and defines an optimal likelihood ratio test (LRT) involving multiple and correlated observations (MCO) based on jointly Gaussian probability distribution functions (jGpdf). An extensive analysis of the proposed methodology for a low-dimensional observation model demonstrates 1) the improved robustness of the proposed approach by means of a clear reduction of the classification error as the number of observations is increased, and 2) the tradeoff between the number of observations and the detection performance. The proposed strategy is also compared to different VAD methods, including the G.729, AMR, and AFE standards as well as other recently reported algorithms, showing a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
  • Keywords
    Gaussian distribution; maximum likelihood estimation; object detection; speech recognition; generalized complex Gaussian observation model; jointly Gaussian probability distribution functions; multiple and correlated observations; optimal likelihood ratio test; real-time applications; speech detection; speech recognition systems; voice activity detection; voice activity detector; Detectors; Hidden Markov models; Noise reduction; Probability distribution; Robustness; Speech enhancement; Speech processing; Speech recognition; Testing; Working environment noise; Generalized complex Gaussian (GCG) probability distribution function; robust speech recognition; voice activity detection (VAD)
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2008.2004293
  • Filename
    4648927