DocumentCode :
2183299
Title :
Localization based stereo speech separation using deep networks
Author :
Yu, Yang ; Wang, Wenwu ; Luo, Jian ; Feng, Pengming
Author_Institution :
School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, China, 710072
fYear :
2015
fDate :
21-24 July 2015
Firstpage :
153
Lastpage :
157
Abstract :
Time-frequency (T-F) masking is an effective method for stereo speech source separation. However, reliable estimation of the T-F mask from sound mixtures is a challenging task, especially when room reverberation is present in the mixtures. In this paper, we propose a new stereo speech separation system in which deep networks are used to generate a soft T-F mask for separation. More specifically, the deep network, which is composed of two sparse autoencoders and a softmax classifier, is used to estimate the orientations of the target and interferers at each T-F unit, based on low-level features such as the mixing vector (MV) and the interaural level and phase differences (ILD/IPD). The deep network is trained by a greedy layer-wise method using a dataset generated by convolving room impulse responses (RIRs) with clean speech signals positioned at different angles with respect to the sensors. With the trained deep networks, the probability that each T-F unit belongs to the target or an interferer can be estimated from the localization cues and used to generate the soft mask. Experiments based on real binaural RIRs and the TIMIT dataset are provided to show the performance of the proposed system on reverberant speech mixtures, as compared with a recently proposed model-based T-F masking technique.
Keywords :
Feature extraction; Neural networks; Reverberation; Source separation; Speech; Speech processing; Training; Deep learning; Deep networks; Soft mask
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Signal Processing (DSP), 2015 IEEE International Conference on
Conference_Location :
Singapore, Singapore
Type :
conf
DOI :
10.1109/ICDSP.2015.7251849
Filename :
7251849