DocumentCode
134223
Title
Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers
Author
Yanhui Tu ; Jun Du ; Yong Xu ; Lirong Dai ; Chin-Hui Lee
Author_Institution
Univ. of Sci. & Technol. of China, Hefei, China
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
250
Lastpage
254
Abstract
In this paper, a novel deep neural network (DNN) architecture is proposed that generates the speech features of both the target speaker and the interferer for speech separation. The DNN directly models the highly nonlinear relationship between the speech features of the mixed signal and those of the two competing speakers. By learning the DNN parameters with these modified dual output speech features, generalization to unseen interferers is improved when separating the target speech. Moreover, the interfering speech can also be separated without any prior information about the interferer. Experimental results show that the proposed DNN improves separation performance on several objective measures in the semi-supervised mode, where training data for the target speaker is provided while the interferer, unseen at the separation stage, is covered by mixing multiple interfering speakers with the target speaker at the training stage.
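The dual-output architecture described in the abstract can be sketched as a network with a shared hidden layer and two output heads, one regressing the target speaker's features and one the interferer's, both fed by the same mixed-signal features. The sketch below is illustrative only: the layer sizes, single hidden layer, and ReLU activation are assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed): D-dim spectral features per frame,
# H hidden units. The paper's actual feature type and sizes may differ.
D, H = 64, 128

# Shared hidden layer feeding two output heads: one predicts the
# target speaker's features, the other the interferer's.
W1 = rng.standard_normal((D, H)) * 0.1
b1 = np.zeros(H)
W_tgt = rng.standard_normal((H, D)) * 0.1  # target-speaker head
b_tgt = np.zeros(D)
W_int = rng.standard_normal((H, D)) * 0.1  # interferer head
b_int = np.zeros(D)

def separate(mixed):
    """Map one frame of mixed-signal features to estimates of the
    target-speaker and interferer feature vectors."""
    h = np.maximum(0.0, mixed @ W1 + b1)  # ReLU hidden activation
    return h @ W_tgt + b_tgt, h @ W_int + b_int

# One mixed frame in, two feature vectors out -- one per speaker.
tgt, intf = separate(rng.standard_normal(D))
```

In training, both heads would be regressed jointly (e.g. with a summed mean-squared error against the clean target and interferer features), which is what lets the interferer be recovered without prior information about it at separation time.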
Keywords
neural nets; speaker recognition; DNN; deep neural networks; interfering speakers; mixed signal speech features; output speech features; semi-supervised mode; speech separation; target speakers; Hidden Markov models; Interference; Neural networks; Signal to noise ratio; Speech; Speech processing; Training; deep neural networks; semi-supervised mode; single-channel speech separation
fLanguage
English
Publisher
IEEE
Conference_Titel
2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936615
Filename
6936615