Title :
Voice activity classification using beamformer-output-ratio
Author :
Tran, Thuy N. ; Cowley, W. ; Pollok, André
Author_Institution :
Inst. for Telecommun. Res., Univ. of South Australia, Adelaide, SA, Australia
Date :
Jan. 30 2012-Feb. 2 2012
Abstract :
In a conversation between multiple speakers, each person speaks at different times, so the active speakers in any given speech segment are not known in advance. However, adaptive beamforming techniques such as minimum variance distortionless response beamforming and adaptive blocking (AB) beamforming require the voice activity (VA) of the speakers of interest to be identified. Considering two speakers, this paper addresses a voice activity classification (VAC) problem: identifying the active speaker(s) in each speech segment. The proposed method is based on a new concept, the beamformer-output-ratio (BOR), which is calculated from the outputs of two different beamformers, each steered toward one of the two speakers. The first part of the paper introduces the definition of the BOR, the VAC method using the BOR, and simulation results. The simulations are based on real recordings and show high classification accuracy. The second part of the paper presents theoretical results for the BOR of delay-and-sum (DS) beamforming, including the BOR formula derived for different environments and its behaviour in relation to parameter errors.
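The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the BOR formula, so everything below is an assumption for illustration only: the BOR is taken to be the ratio of the mean output powers of two delay-and-sum beamformers, the signals are narrowband snapshots on a uniform linear array, and the names steering_vector, bor_db, classify and simulate, the array geometry, the angles and the 3 dB threshold are all hypothetical rather than taken from the paper.

```python
import numpy as np

def steering_vector(theta_deg, n_mics, spacing=0.05, freq=1000.0, c=343.0):
    """Far-field steering vector of a uniform linear array at one frequency."""
    m = np.arange(n_mics)
    delay = m * spacing * np.sin(np.deg2rad(theta_deg)) / c
    return np.exp(-2j * np.pi * freq * delay)

def bor_db(snapshots, theta1, theta2):
    """Assumed BOR: output-power ratio (dB) of two delay-and-sum beamformers."""
    n_mics = snapshots.shape[0]
    w1 = steering_vector(theta1, n_mics) / n_mics  # steered at speaker 1
    w2 = steering_vector(theta2, n_mics) / n_mics  # steered at speaker 2
    p1 = np.mean(np.abs(w1.conj() @ snapshots) ** 2)
    p2 = np.mean(np.abs(w2.conj() @ snapshots) ** 2)
    return 10.0 * np.log10(p1 / p2)

def classify(bor, threshold_db=3.0):
    """Hypothetical decision rule; assumes at least one speaker is active."""
    if bor > threshold_db:
        return "speaker 1"
    if bor < -threshold_db:
        return "speaker 2"
    return "both"

def simulate(active, thetas=(-30.0, 40.0), n_mics=8, n_snap=2000,
             noise_std=0.1, seed=0):
    """Synthetic snapshots: independent sources plus white sensor noise."""
    rng = np.random.default_rng(seed)
    shape = (n_mics, n_snap)
    x = noise_std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    for is_on, theta in zip(active, thetas):
        if is_on:
            s = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
            x = x + np.outer(steering_vector(theta, n_mics), s)
    return x

if __name__ == "__main__":
    for active in [(True, False), (False, True), (True, True)]:
        bor = bor_db(simulate(active), -30.0, 40.0)
        print(active, classify(bor))
```

In this toy setup, a segment dominated by one speaker pushes the ratio well away from 0 dB, while a segment with both speakers at similar levels keeps it near 0 dB. The paper's simulations use real recordings, which this synthetic example does not attempt to reproduce.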
Keywords :
array signal processing; signal classification; speaker recognition; BOR formula; VAC problem; active speaker; adaptive beamforming; adaptive blocking beamforming; beamformer-output-ratio; delay-and-sum beamforming; minimum variance distortionless response beamforming; parameter error; speakers conversation; speech segment; voice activity classification; voice activity identification; Array signal processing; Indexes; Microphones; Signal to noise ratio; Speech; Vectors;
Conference_Title :
Communications Theory Workshop (AusCTW), 2012 Australian
Conference_Location :
Wellington
Print_ISBN :
978-1-4577-1961-5
DOI :
10.1109/AusCTW.2012.6164913