• DocumentCode
    1754486
  • Title
    Dual-Microphone Voice Activity Detection Technique Based on Two-Step Power Level Difference Ratio
  • Author
    Jae-Hun Choi; Joon-Hyuk Chang
  • Author_Institution
    Sch. of Electron. Eng., Hanyang Univ., Seoul, South Korea
  • Volume
    22
  • Issue
    6
  • fYear
    2014
  • fDate
    2014-06-01
  • Firstpage
    1069
  • Lastpage
    1081
  • Abstract
    In this paper, we propose a novel dual-microphone voice activity detection (VAD) technique based on the two-step power level difference (PLD) ratio. The technique exploits the PLD between the primary and secondary microphones of a mobile device when the distance between the microphones and the sound source is relatively short. Rather than using the original PLD directly, we propose the PLD ratio (PLDR), which takes advantage of the relative difference between the PLD of speech and the PLD of noise. The PLDR is obtained by estimating the ratio of the PLD between the input signals to the PLD between the two channel noises during periods without speech. The proposed technique is a two-step algorithm. The first step uses two PLDRs: the long-term PLDR (LT-PLDR), which characterizes long-term evolution, and the short-term PLDR (ST-PLDR), which characterizes short-time variation. LT-PLDR-based and ST-PLDR-based VAD decisions are made using the maximum a posteriori (MAP) probability derived from the model-trust algorithm and are combined in the second step to reach a superior VAD decision for both long-term and short-term situations. Extensive experimental results show that the proposed dual-microphone VAD technique outperforms the conventional two-channel VAD method as well as most standardized VAD algorithms.
  • Keywords
    acoustic generators; acoustic radiators; acoustic signal detection; maximum likelihood estimation; microphones; speech recognition; voice communication; LT-PLDR; MAP probability; ST-PLDR; channel noises; dual-microphone VAD technique; dual-microphone voice activity detection technique; long-term PLDR; maximum a posteriori probability; mobile device; model-trust algorithm; noise PLD; primary microphone; secondary microphone; short-term PLDR; sound source; speech PLD; two-channel VAD method; two-step PLDR; two-step power level difference ratio; Estimation; Microphones; Noise; Noise measurement; Smoothing methods; Speech; Speech processing; Dual-microphone; power level difference ratio; two-step; voice activity detection;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2014.2313917
  • Filename
    6803880