Regional Information Center for Science and Technology - Convolutional maxout neural networks for speech separation

DocumentCode :

3738512

Title :

Convolutional maxout neural networks for speech separation

Author :

Like Hui;Meng Cai;Cong Guo;Liang He;Wei-Qiang Zhang;Jia Liu

Author_Institution :

Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Year :

2015

Firstpage :

Lastpage :

Abstract :

Speech separation based on deep neural networks (DNNs) has been widely studied recently, and has achieved considerable success. However, previous studies are mostly based on fully-connected neural networks. In order to capture the local information of speech signals, we propose to use convolutional maxout neural networks (CMNNs) to separate speech and noise by estimating the ideal ratio mask of the time-frequency units. In our work, the proposed CMNN is applied in the frequency domain. By using local filtering and max-pooling, convolutional neural networks can model the local structure of speech signals. Instead of the sigmoid function, maxout is selected as the activation to address the saturation problem. In addition, dropout is integrated into the network to improve generalization. The proposed system outperforms a traditional DNN-based system in both objective speech quality and intelligibility.
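
The two building blocks named in the abstract, the ideal ratio mask and the maxout activation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give the mask formula, so the commonly used form (S / (S + N)) ** beta with beta = 0.5 is an assumption, and the function names, argument names, and array shapes are hypothetical choices made for illustration.

```python
import numpy as np

def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Ratio mask per time-frequency unit, with values in [0, 1].

    Assumed form: (S / (S + N)) ** beta, where S and N are the speech and
    noise power in each unit. eps only guards against division by zero.
    """
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return (speech_power / (speech_power + noise_power + eps)) ** beta

def maxout(x, weights, biases):
    """Maxout layer: each output unit is the max over k affine pieces.

    x: (batch, d_in); weights: (k, d_in, d_out); biases: (k, d_out).
    Unlike a sigmoid, the output is piecewise linear and does not saturate.
    """
    pieces = np.einsum("bi,kio->bko", x, weights) + biases  # (batch, k, d_out)
    return pieces.max(axis=1)

# Usage: a 2x2 grid of time-frequency units and a two-piece maxout layer.
irm = ideal_ratio_mask([[1.0, 0.0], [3.0, 1.0]], [[1.0, 1.0], [1.0, 0.0]])
w = np.stack([np.eye(2), -np.eye(2)])   # pieces +x and -x, so max gives |x|
out = maxout(np.array([[1.0, -2.0]]), w, np.zeros((2, 2)))
```

In a separation system of the kind described, the network would be trained to predict such a mask from noisy features, and the estimated mask would then be applied to the noisy time-frequency representation; those steps are omitted here.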

Keywords :

"Speech","Convolution","Training","Signal to noise ratio","Frequency-domain analysis","Feature extraction","Neural networks"

Publisher :

IEEE

Conference_Title :

Signal Processing and Information Technology (ISSPIT), 2015 IEEE International Symposium on

Type :

conf

DOI :

10.1109/ISSPIT.2015.7394335

Filename :

7394335

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3738512