Regional Information Center for Science and Technology (RICEST) - Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

DocumentCode :

740090

Title :

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

Author :

Huang, Po-Sen ; Kim, Minje ; Hasegawa-Johnson, Mark ; Smaragdis, Paris

Author_Institution :

Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Volume :

Issue :

fYear :

2015

Firstpage :

2136

Lastpage :

2147

Abstract :

Monaural source separation is important for many real-world applications. It is challenging because, with only a single channel of information available and no additional constraints, an infinite number of solutions are possible. In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including speech separation, singing voice separation, and speech denoising. The joint optimization of the deep recurrent neural networks with an extra masking layer enforces a reconstruction constraint. Moreover, we explore a discriminative criterion for training neural networks to further enhance the separation performance. We evaluate the proposed system on the TSP, MIR-1K, and TIMIT datasets for speech separation, singing voice separation, and speech denoising tasks, respectively. Our approaches achieve 2.30–4.98 dB SDR gain compared to NMF models in the speech separation task, 2.30–2.48 dB GNSDR gain and 4.32–5.42 dB GSIR gain compared to existing models in the singing voice separation task, and outperform NMF and DNN baselines in the speech denoising task.
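
The abstract names two ideas: a masking layer that enforces a reconstruction constraint, and a discriminative training criterion. The sketch below illustrates one common way such ideas can be written down; it is not taken from this record. It assumes a soft ratio mask over magnitude spectrograms and a squared-error objective with a cross-source penalty. The function names, the `eps` stabilizer, the `gamma` weight, and the random toy inputs are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_mask_layer(y1_hat, y2_hat, mixture_mag, eps=1e-8):
    """Turn two raw source estimates into masked estimates.

    Assumption: the mask is the ratio of estimate magnitudes, so the two
    masked outputs always sum back to the mixture magnitude.
    """
    a1 = np.abs(y1_hat)
    a2 = np.abs(y2_hat)
    mask1 = a1 / (a1 + a2 + eps)         # values in [0, 1]
    s1 = mask1 * mixture_mag             # masked estimate of source 1
    s2 = (1.0 - mask1) * mixture_mag     # masked estimate of source 2
    return s1, s2


def discriminative_loss(s1, s2, target1, target2, gamma=0.05):
    """Squared error to the matching target, minus a weighted squared
    error to the other target (hypothetical weight `gamma`)."""
    fit = np.sum((s1 - target1) ** 2) + np.sum((s2 - target2) ** 2)
    cross = np.sum((s1 - target2) ** 2) + np.sum((s2 - target1) ** 2)
    return fit - gamma * cross


# Toy data standing in for network outputs and a mixture spectrogram
# (4 frequency bins x 5 frames); a real system would use STFT magnitudes.
rng = np.random.default_rng(0)
mixture_mag = rng.random((4, 5))
y1_hat = rng.standard_normal((4, 5))
y2_hat = rng.standard_normal((4, 5))

s1, s2 = soft_mask_layer(y1_hat, y2_hat, mixture_mag)
```

Because the two masks sum to one in every time-frequency bin, `s1 + s2` reproduces `mixture_mag`, which is the sense in which a masking layer of this kind acts as a reconstruction constraint. Setting `gamma=0` reduces the loss to a plain squared-error fit.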

Keywords :

Computers; Electronic mail; IEEE transactions; Indexes; Speech; Speech processing; Standards; Deep recurrent neural network (DRNN); discriminative training; monaural source separation; time–frequency masking

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE/ACM Transactions on

Publisher :

IEEE

ISSN :

2329-9290

Type :

jour

DOI :

10.1109/TASLP.2015.2468583

Filename :

7194774

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=740090