DocumentCode
178899
Title
Deep neural networks for single channel source separation
Author
Grais, E.M. ; Sen, Mehmet Umut ; Erdogan, H.
Author_Institution
Fac. of Eng. & Natural Sci., Sabanci Univ., Istanbul, Turkey
fYear
2014
fDate
4-9 May 2014
Firstpage
3734
Lastpage
3738
Abstract
In this paper, a novel approach for single channel source separation (SCSS) using a deep neural network (DNN) architecture is introduced. Unlike previous studies in which DNNs and other classifiers were used to classify time-frequency bins to obtain hard masks for each source, we use the DNN to classify estimated source spectra to check their validity during separation. In the training stage, the training data for the source signals are used to train a DNN. In the separation stage, the trained DNN is utilized to aid in the estimation of each source in the mixed signal. The single channel source separation problem is formulated as an energy minimization problem in which each source spectrum estimate is encouraged to fit the trained DNN model and the mixed signal spectrum is encouraged to be written as a weighted sum of the estimated source spectra. The proposed approach works regardless of the energy scale differences between the source signals in the training and separation stages. Nonnegative matrix factorization (NMF) is used to initialize the DNN estimate for each source. The experimental results show that using a DNN initialized by NMF for source separation improves the quality of the separated signal compared with using NMF alone for source separation.
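The NMF baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes magnitude spectrograms stored as nonnegative NumPy arrays (frequency by time), a Euclidean cost with multiplicative updates, and a soft-mask reconstruction. The function names `nmf` and `separate` and all parameter values are hypothetical, and the DNN-based energy minimization itself is not shown.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Euclidean-cost NMF with multiplicative updates (V ~ W @ H).

    If W is supplied it is kept fixed and only H is estimated
    (in that case `rank` is ignored and taken from W).
    """
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(Y, W1, W2, n_iter=200):
    """Split mixture spectrogram Y using per-source bases W1 and W2.

    Assumption: W1 and W2 were trained beforehand with nmf() on
    clean training spectrograms of each source.
    """
    W = np.hstack([W1, W2])                      # stacked dictionary
    _, H = nmf(Y, W.shape[1], n_iter=n_iter, W=W)
    k = W1.shape[1]
    S1 = W1 @ H[:k]                              # initial estimate, source 1
    S2 = W2 @ H[k:]                              # initial estimate, source 2
    mask1 = S1 / (S1 + S2 + EPS)                 # soft mask in [0, 1]
    return mask1 * Y, (1.0 - mask1) * Y


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W1, W2 = rng.random((8, 3)), rng.random((8, 3))   # toy bases
    Y = W1 @ rng.random((3, 5)) + W2 @ rng.random((3, 5))
    X1, X2 = separate(Y, W1, W2, n_iter=50)
    print(X1.shape, X2.shape)
```

In this sketch the two estimates always sum to the mixture spectrogram because of the soft mask. According to the abstract, estimates of this kind serve as the initialization that the DNN-based stage then refines.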
Keywords
audio signal processing; learning (artificial intelligence); matrix decomposition; minimisation; neural net architecture; signal classification; source separation; DNN estimation; NMF; SCSS; deep neural network architecture; energy minimization problem; mixed signal spectrum; nonnegative matrix factorization; single channel source separation; source spectra classification; source spectra estimation classification; training data; Dictionaries; Hidden Markov models; Source separation; Speech; Speech processing; Training; Training data; Single channel source separation; deep neural network; nonnegative matrix factorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6854299
Filename
6854299