DocumentCode
105845
Title
Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays
Author
Souden, Mehrez ; Kinoshita, Keisuke ; Delcroix, Marc ; Nakatani, Tomohiro
Author_Institution
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Volume
22
Issue
2
fYear
2014
fDate
Feb. 2014
Firstpage
354
Lastpage
367
Abstract
In distributed microphone arrays (DMAs), source location information can be defined at the intra-node and inter-node levels. The first type of information results from the diversity of acoustic channels recorded by microphones embedded in the same node, while the second is attributed to the differences between the acoustic channels observed by spatially distributed nodes. Both cues are useful in DMA processing, and the aim of this paper is to use both of them to cluster and separate multiple competing speech signals. To capture the intra-node information, we employ the normalized recording vector, while at the inter-node level we consider different features, including the energy level differences with and without the phase differences between nodes. We model the intra-node information using the Watson mixture model (WMM), and propose using the Gamma mixture model (GaMM), Dirichlet mixture model (DMM), and WMM to model different inter-node location features. Furthermore, we propose several integrations of the intra-node and inter-node feature contributions to cluster speech recordings using the expectation-maximization algorithm. Finally, simulation results are provided to demonstrate the performance of the resulting methods.
Keywords
blind source separation; expectation-maximisation algorithm; microphone arrays; mixture models; Dirichlet mixture model; Gamma mixture model; Watson mixture model; acoustic channels; clustering based speech separation; distributed microphone arrays; energy level differences; expectation maximization algorithm; location feature integration; normalized recording vector; phase differences; source location information; spatially distributed nodes; speech recordings; speech signals; Acoustics; Clustering algorithms; Microphone arrays; Speech; Speech processing; Vectors; Blind source separation; decision fusion; distributed microphone array processing; location-based speech clustering; speech enhancement
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2013.2292308
Filename
6672017