Title :
Sound source separation based on non-negative tensor factorization incorporating spatial cue as prior knowledge
Author :
Mitsufuji, Yuki ; Roebel, A.
Author_Institution :
Sony Corp., Tokyo, Japan
Abstract :
This paper presents a new source separation method that uses a spatial cue, given by a user or taken from accompanying images, to extract a target sound. The algorithm is based on non-negative tensor factorization (NTF), which decomposes multichannel spectrograms into three matrices. The components of one of the three matrices represent spatial information and are associated with the spatial cue, thus indicating which bins of the spectrogram should be given preference. When a spatial cue is available, this method has a great advantage over conventional PARAFAC-NTF in terms of both computational cost and separation quality, as measured by evaluation metrics such as the signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR) and signal-to-artifacts ratio (SAR).
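The three-matrix decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a squared-error cost with standard multiplicative updates, and it assumes the spatial cue is supplied as a known channel-gain matrix that is held fixed during the updates. The names parafac_ntf, soft_mask and Q_prior are hypothetical and do not come from the paper.

```python
import numpy as np

def parafac_ntf(V, n_components, Q_prior=None, n_iter=200, eps=1e-12, seed=0):
    """Approximate V[c, f, t] by sum_k Q[c, k] * W[f, k] * H[k, t].

    Q holds per-channel (spatial) gains, W spectral templates, H activations.
    If Q_prior is given (assumed form of the spatial cue), Q is kept fixed
    and only W and H are updated.
    """
    rng = np.random.default_rng(seed)
    C, F, T = V.shape
    K = n_components
    if Q_prior is None:
        Q = rng.random((C, K)) + eps
    else:
        Q = np.array(Q_prior, dtype=float)  # private copy of the cue
    W = rng.random((F, K)) + eps
    H = rng.random((K, T)) + eps

    def model():
        return np.einsum("ck,fk,kt->cft", Q, W, H)

    errors = []
    for _ in range(n_iter):
        if Q_prior is None:  # spatial factor is learned only without a cue
            V_hat = model()
            Q *= np.einsum("cft,fk,kt->ck", V, W, H) / (
                np.einsum("cft,fk,kt->ck", V_hat, W, H) + eps)
        V_hat = model()
        W *= np.einsum("cft,ck,kt->fk", V, Q, H) / (
            np.einsum("cft,ck,kt->fk", V_hat, Q, H) + eps)
        V_hat = model()
        H *= np.einsum("cft,ck,fk->kt", V, Q, W) / (
            np.einsum("cft,ck,fk->kt", V_hat, Q, W) + eps)
        errors.append(float(np.sum((V - model()) ** 2)))
    return Q, W, H, errors

def soft_mask(Q, W, H, k, eps=1e-12):
    """Share of the modelled energy in each bin that belongs to component k."""
    total = np.einsum("ck,fk,kt->cft", Q, W, H)
    target = np.einsum("c,f,t->cft", Q[:, k], W[:, k], H[k, :])
    return target / (total + eps)

# Synthetic two-channel example: component 0 sits mostly in channel 0.
rng = np.random.default_rng(1)
Q_true = np.array([[0.9, 0.1],
                   [0.1, 0.9]])
W_true = rng.random((16, 2))
H_true = rng.random((2, 20))
V = np.einsum("ck,fk,kt->cft", Q_true, W_true, H_true)

Q, W, H, errors = parafac_ntf(V, n_components=2, Q_prior=Q_true, n_iter=100)
mask = soft_mask(Q, W, H, k=0)  # weights the bins attributed to the target
```

Fixing the spatial factor removes one of the three updates per iteration, which is one plausible reading of the computational saving mentioned in the abstract; the paper may integrate the cue differently.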
Keywords :
audio signal processing; matrix decomposition; source separation; tensors; PARAFAC-NTF; SAR; SDR; SIR; computational costs; evaluation metrics; multichannel spectrograms; nonnegative tensor factorization; separation quality; sound source separation; source separation method; spatial cue; spatial information; target sound extraction; Cost function; Histograms; Signal processing algorithms; Source separation; Spectrogram; Tensile stress; Training; Audio source separation; Nonnegative Tensor Factorization; Signal reconstruction; Sparse representation;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6637611