Multi-channel source separation by beamforming trained with factorial HMMs

Author

Reyes-Gomez, Manuel J. ; Bhiksha, R. ; Ellis, Daniel P W

Author_Institution

Dept. of Electr. Eng., Columbia Univ., New York, NY, USA

fYear

2003

fDate

19-22 Oct. 2003

Firstpage

13

Lastpage

16

Abstract

Speaker separation has conventionally been treated as a problem of blind source separation (BSS). This approach does not utilize any knowledge of the statistical characteristics of the signals to be separated, relying mainly on the independence between the various signals to separate them. Maximum-likelihood techniques, on the other hand, utilize knowledge of the a priori probability distributions of the signals from the speakers in order to effect separation. Previously (Reyes-Gomez, M.J. et al., Proc. ICASSP, 2003), we presented a maximum-likelihood speaker separation technique that utilizes detailed statistical information about the signals to be separated, represented in the form of hidden Markov models (HMMs), to estimate the parameters of a filter-and-sum processor for signal separation. We show that the filters that are estimated for a particular utterance by a speaker generalize well to other utterances by the same speaker, provided the locations of the various speakers remain constant. Thus, filters that have been estimated using a "training" utterance with a known transcript can be used to separate all future signals from that speaker out of mixtures of speech signals in an unsupervised manner. On the other hand, the filters are ineffective for other speakers, even at the same locations, indicating that they capture the spatio-frequency characteristics of the speaker.
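
The filter-and-sum processor named in the abstract can be illustrated with a minimal sketch: each microphone channel is passed through its own FIR filter and the filtered channels are summed. The abstract does not give the filter values or the HMM-based estimation procedure, so the filters below are set by hand for a toy two-microphone case with a known integer delay. The function name `filter_and_sum` and all variable names are hypothetical and are not taken from the paper.

```python
import numpy as np


def filter_and_sum(channels, filters):
    """Apply one FIR filter per microphone channel and sum the results.

    channels: array of shape (num_mics, num_samples)
    filters:  array of shape (num_mics, filter_length)
    Returns an array of length num_samples.
    """
    channels = np.asarray(channels, dtype=float)
    filters = np.asarray(filters, dtype=float)
    if channels.shape[0] != filters.shape[0]:
        raise ValueError("need one filter per microphone channel")
    num_samples = channels.shape[1]
    output = np.zeros(num_samples)
    for x_m, h_m in zip(channels, filters):
        # Full convolution, truncated to the original signal length.
        output += np.convolve(x_m, h_m)[:num_samples]
    return output


# Toy setup (an assumption for illustration): one source reaches
# microphone 2 three samples later than microphone 1.
rng = np.random.default_rng(0)
source = rng.standard_normal(64)
delay = 3
mic1 = source.copy()
mic2 = np.concatenate([np.zeros(delay), source[:-delay]])

# Hand-set filters: delay mic1 by three samples so it lines up with mic2,
# pass mic2 through unchanged, and weight each channel by 0.5.
# In the paper these filters are estimated by maximum likelihood with HMMs;
# that estimation step is not reproduced here.
filters = np.zeros((2, delay + 1))
filters[0, delay] = 0.5
filters[1, 0] = 0.5

separated = filter_and_sum(np.stack([mic1, mic2]), filters)
```

With these hand-set filters the processor reduces to a delay-and-sum beamformer steered at the toy source, so `separated` reproduces the time-aligned source signal.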

Keywords

array signal processing; audio signal processing; filtering theory; hidden Markov models; maximum likelihood estimation; source separation; statistical analysis; BSS; HMM; a priori probability distributions; beamforming; blind source separation; filter-and-sum processor; maximum-likelihood techniques; multi-channel source separation; multiple microphones; parameter estimation; speaker separation; statistical characteristics; training utterance; filters; information filtering; probability distribution; signal processing

fLanguage

English

Publisher

IEEE

Conference_Titel

Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on.

Print_ISBN

0-7803-7850-4

Type

conf

DOI

10.1109/ASPAA.2003.1285797

Filename

1285797