Title :
Investigation of deep Boltzmann machines for phone recognition
Author :
Zhao You ; Xiaorui Wang ; Bo Xu
Author_Institution :
Interactive Digital Media Technol. Res. Center, Inst. of Autom., Beijing, China
Abstract :
In the past few years, deep neural networks (DNNs) have achieved great success in speech recognition. The layer-wise pre-trained deep belief network (DBN) is known as one of the critical factors in optimizing the DNN. However, the DBN has one shortcoming: its pre-training procedure is a greedy forward pass. Top-down influences on the inference process are ignored, so the pre-trained DBN is suboptimal. In this paper, we apply the deep Boltzmann machine (DBM) to acoustic modeling. The DBM has two advantages: it incorporates top-down feedback, and the parameters of all layers can be jointly optimized. Experiments are conducted on the TIMIT phone recognition task to investigate the DBM-DNN acoustic model. Compared with the DBN-DNN with the same number of parameters, the phone error rate on the core test set is reduced by 3.8% relative, and by an additional 5.1% with dropout fine-tuning.
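The contrast drawn in the abstract between a greedy forward pass and inference with top-down feedback can be illustrated with a minimal sketch. It assumes a generic two-hidden-layer Boltzmann machine with binary units and textbook mean-field updates; the layer sizes, random weights, iteration count, and function names are all hypothetical and are not taken from the paper, whose actual model and training setup are not given in this record.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bottom_up_pass(v, W1, b1, W2, b2):
    """Single greedy forward pass: each layer sees only the layer below."""
    h1 = sigmoid(v @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    return h1, h2

def dbm_mean_field(v, W1, b1, W2, b2, n_iters=10):
    """Mean-field inference in a two-hidden-layer DBM (illustrative only).

    The first hidden layer receives bottom-up input from v AND top-down
    input from the second hidden layer (the mu2 @ W2.T term).
    """
    mu1, mu2 = bottom_up_pass(v, W1, b1, W2, b2)  # initialise bottom-up
    for _ in range(n_iters):
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T + b1)   # bottom-up + top-down
        mu2 = sigmoid(mu1 @ W2 + b2)
    return mu1, mu2

# Toy data: hypothetical sizes and random parameters, not the paper's.
rng = np.random.default_rng(0)
n_vis, n_h1, n_h2 = 39, 16, 8
W1 = 0.1 * rng.standard_normal((n_vis, n_h1))
W2 = 0.1 * rng.standard_normal((n_h1, n_h2))
b1 = np.zeros(n_h1)
b2 = np.zeros(n_h2)
v = rng.random((4, n_vis))  # 4 toy input frames with values in [0, 1)

h1, h2 = bottom_up_pass(v, W1, b1, W2, b2)
mu1, mu2 = dbm_mean_field(v, W1, b1, W2, b2)
```

In this sketch the only difference between the two procedures is the `mu2 @ W2.T` term: if `W2` is all zeros the two give identical first-layer activations, and otherwise the upper layer changes what the lower layer infers.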
Keywords :
Boltzmann machines; belief networks; greedy algorithms; speech recognition; DBM-DNN acoustic model; TIMIT phone recognition; deep Boltzmann machine; deep neural network; greedy forward pass; inference process; layer-wise pretrained deep belief network; phone error rate; Acoustics; Data models; Maximum likelihood decoding; Neural networks; Speech; Training; Deep Boltzmann Machines; Deep Neural Networks; acoustic modeling; phone recognition
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639141