Title :
Sparse Overcomplete Decomposition for Single Channel Speaker Separation
Author :
Shashanka, M.V.S. ; Raj, Bhiksha ; Smaragdis, Paris
Author_Institution :
Hearing Res. Center, Boston Univ., MA, USA
Abstract :
We present an algorithm for separating multiple speakers from a mixed single-channel recording. The algorithm is based on a model proposed by Raj and Smaragdis (2005). The idea is to extract characteristic spectro-temporal basis functions from training data for individual speakers and to decompose the mixed signals as linear combinations of these learned bases. In other words, their model extracts a compact code of basis functions that can explain the space spanned by the spectral vectors of a speaker. In our model, we instead generate a sparse-distributed code in which there are more basis functions than the dimensionality of the space. We propose a probabilistic framework to achieve sparsity. Experiments show that the resulting sparse code better captures the structure in the data and hence leads to better separation.
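The decomposition step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that per-speaker bases are already available as non-negative matrices, uses standard KL-divergence multiplicative updates to fit the mixture weights, and swaps in a simple L1 penalty (the hypothetical `sparsity` parameter) in place of the probabilistic sparsity framework the paper proposes. All names (`mixture_weights`, `separate`, `W_a`, `W_b`) and the toy data are illustrative assumptions.

```python
import numpy as np

def mixture_weights(V, W, sparsity=0.0, n_iter=200, eps=1e-12):
    """Fit non-negative weights H so that V is approximately W @ H.

    The bases W stay fixed. The update is the standard KL-divergence
    multiplicative rule; `sparsity` adds an L1 penalty on H (an assumption
    standing in for the paper's probabilistic sparsity prior).
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None]          # shape (K, 1)
    for _ in range(n_iter):
        ratio = V / (W @ H + eps)              # elementwise V / (W H)
        H *= (W.T @ ratio) / (col_sums + sparsity + eps)
    return H

def separate(V, W_a, W_b, sparsity=0.0):
    """Split a mixture magnitude spectrogram V between two speakers."""
    W = np.hstack([W_a, W_b])                  # concatenate both basis sets
    H = mixture_weights(V, W, sparsity=sparsity)
    k = W_a.shape[1]
    est_a = W_a @ H[:k]                        # part explained by speaker A
    est_b = W_b @ H[k:]                        # part explained by speaker B
    total = est_a + est_b + 1e-12
    # Ratio masks distribute each mixture bin between the two speakers.
    return V * est_a / total, V * est_b / total

# Toy data: 8 frequency bins, 12 bases per speaker (overcomplete), 5 frames.
rng = np.random.default_rng(1)
F, K, T = 8, 12, 5
W_a = rng.random((F, K)); W_a /= W_a.sum(axis=0)
W_b = rng.random((F, K)); W_b /= W_b.sum(axis=0)
V = W_a @ rng.random((K, T)) + W_b @ rng.random((K, T))

s_a, s_b = separate(V, W_a, W_b, sparsity=0.1)
```

With 12 bases per speaker in an 8-dimensional space, the code is overcomplete in the sense used in the abstract, so some form of sparsity constraint is needed to keep the weights from being arbitrary. The two returned estimates sum back to the mixture by construction of the masks.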
Keywords :
source separation; speech processing; characteristic spectra-temporal basis functions; mixed single channel recording; single channel speaker separation; sparse overcomplete decomposition; sparse-distributed code; spectral vectors; Auditory system; Data mining; Entropy; Equations; Frequency; Graphical models; Random processes; Speech enhancement; Training data; Vectors; MAP estimation; Minimum entropy methods; Separation
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.366317