Title :
Deep maxout neural networks for speech recognition
Author :
Meng Cai ; Yongzhe Shi ; Jia Liu
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Abstract :
A recently introduced type of neural network called maxout has worked well in many domains. In this paper, we propose to apply maxout to acoustic models in speech recognition. The maxout neuron picks the maximum value within a group of linear pieces as its activation. This nonlinearity is a generalization of the rectified nonlinearity and can approximate arbitrary activation functions. We apply maxout networks to the Switchboard phone-call transcription task and evaluate performance under both a 24-hour low-resource condition and a 300-hour core condition. Experimental results demonstrate that maxout networks converge faster, generalize better, and are easier to optimize than rectified linear networks and sigmoid networks. Furthermore, experiments show that maxout networks reduce underfitting and are able to achieve good results without dropout training. Under both conditions, maxout networks yield relative improvements of 1.1-5.1% over rectified linear networks and 2.6-14.5% over sigmoid networks on benchmark test sets.
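The maxout activation described in the abstract can be sketched in a few lines: each hidden unit computes several affine "pieces" of its input and outputs the largest one. The sketch below is a minimal NumPy illustration of this idea; the function name, shapes, and piece count are illustrative assumptions, not details from the paper.

```python
import numpy as np

def maxout(x, W, b, num_pieces):
    """Maxout activation (illustrative sketch).

    x: (batch, in_dim) inputs
    W: (in_dim, out_dim * num_pieces) weights
    b: (out_dim * num_pieces,) biases
    Each output unit takes the max over its `num_pieces` linear pieces.
    """
    z = x @ W + b                                  # all linear pieces at once
    z = z.reshape(z.shape[0], -1, num_pieces)      # (batch, out_dim, num_pieces)
    return z.max(axis=-1)                          # pick the largest piece per unit

# Hypothetical usage: 3 maxout units with 2 pieces each over a 10-dim input.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 10))
W = rng.standard_normal((10, 3 * 2))
b = np.zeros(6)
h = maxout(x, W, b, 2)
print(h.shape)  # (4, 3)
```

With two pieces and the second piece fixed to zero, the unit reduces to a rectified linear unit, which is the sense in which maxout generalizes the rectified nonlinearity.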
Keywords :
acoustic signal processing; learning (artificial intelligence); network theory (graphs); neural nets; speech recognition; acoustic models; benchmark test sets; core condition; deep maxout neural networks; low-resource condition; maxout neuron; performance evaluation; rectified linear networks; rectified nonlinearity; sigmoid networks; speech recognition; switchboard phone-call transcription task; time 24 hour; time 300 hour; Acoustics; Biological neural networks; Hidden Markov models; Neurons; Speech recognition; Switches; Training; Maxout networks; acoustic modeling; neuron nonlinearity; speech recognition;
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location :
Olomouc, Czech Republic
DOI :
10.1109/ASRU.2013.6707745