Title :
Convolutional maxout neural networks for speech separation
Author :
Like Hui;Meng Cai;Cong Guo;Liang He;Wei-Qiang Zhang;Jia Liu
Author_Institution :
Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Abstract :
Speech separation based on deep neural networks (DNNs) has been widely studied recently and has achieved considerable success. However, previous studies are mostly based on fully connected neural networks. To capture the local information of speech signals, we propose convolutional maxout neural networks (CMNNs) that separate speech from noise by estimating the ideal ratio mask of the time-frequency units. The proposed CMNN is applied in the frequency domain. Through local filtering and max-pooling, convolutional neural networks can model the local structure of speech signals. Maxout is used instead of the sigmoid function to address the saturation problem. In addition, dropout is integrated into the network to improve generalization. The proposed system outperforms a traditional DNN-based system in both objective speech quality and intelligibility.
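As a rough illustration of two ideas named in the abstract, the NumPy sketch below computes an ideal ratio mask per time-frequency unit and applies a maxout activation. It is not the authors' implementation. The function names, the exponent `beta=0.5`, the small constant `eps`, and the grouping of adjacent units into maxout pieces are assumptions chosen for illustration; the paper may define the mask and the network layout differently.

```python
import numpy as np


def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Ideal ratio mask for each time-frequency unit.

    Uses the common form (S / (S + N)) ** beta, where S and N are the
    speech and noise power in a unit. beta=0.5 is an assumed default.
    eps avoids division by zero in silent units.
    """
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return (speech_power / (speech_power + noise_power + eps)) ** beta


def maxout(z, pieces):
    """Maxout activation over groups of `pieces` adjacent linear outputs.

    z has shape (n_frames, n_units), and n_units must be divisible by
    `pieces`. Each output is the maximum of its group, so the activation
    does not saturate the way a sigmoid does.
    """
    z = np.asarray(z, dtype=float)
    n_frames, n_units = z.shape
    if n_units % pieces != 0:
        raise ValueError("n_units must be divisible by pieces")
    return z.reshape(n_frames, n_units // pieces, pieces).max(axis=2)
```

In a mask-based system of this kind, the network output would be trained to approximate the mask, and the estimated mask would then be multiplied with the noisy time-frequency representation to recover the speech.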
Keywords :
"Speech","Convolution","Training","Signal to noise ratio","Frequency-domain analysis","Feature extraction","Neural networks"
Conference_Title :
Signal Processing and Information Technology (ISSPIT), 2015 IEEE International Symposium on
DOI :
10.1109/ISSPIT.2015.7394335