• DocumentCode
    3678654
  • Title
    Offline Voice Activity Detector using speech supergaussianity
  • Author
    Ivan J. Tashev
  • Author_Institution
    Microsoft Research Labs, One Microsoft Way, Redmond, WA 98051, USA
  • fYear
    2015
  • Firstpage
    214
  • Lastpage
    219
  • Abstract
    Voice Activity Detectors (VADs) play an important role in audio processing algorithms. Most VAD algorithms are designed to be causal, i.e., to work in real time using only current and past audio samples. Off-line processing, where the entire voice utterance is available, allows different types of approaches to be used for increased precision. In this paper we propose an algorithm for off-line VAD based on the different probability density functions (PDFs) of speech and noise. While a Gaussian distribution is a very good model for noise, the speech PDF is peakier. The proposed VAD algorithm works in the frequency domain and estimates the speech presence probability for each frequency bin in each audio frame and the speech presence probability for each frame, and also provides a binary decision per bin and per frame. It provides improved precision compared to streaming real-time VAD algorithms.
  • Keywords
    "Speech recognition","Hidden Markov models","Speech","Algorithm design and analysis","Detectors","Analytical models","Histograms"
  • Publisher
    IEEE
  • Conference_Titel
    Information Theory and Applications Workshop (ITA), 2015
  • Type
    conf
  • DOI
    10.1109/ITA.2015.7308991
  • Filename
    7308991