DocumentCode :
179550
Title :
Phone sequence modeling with recurrent neural networks
Author :
Boulanger-Lewandowski, Nicolas ; Droppo, Jasha ; Seltzer, Mike ; Dong Yu
Author_Institution :
Dept. IRO Montreal, Univ. de Montreal, Montreal, QC, Canada
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
5417
Lastpage :
5421
Abstract :
In this paper, we investigate phone sequence modeling with recurrent neural networks in the context of speech recognition. We introduce a hybrid architecture that combines a phonetic model with an arbitrary frame-level acoustic model and we propose efficient algorithms for training, decoding and sequence alignment. We evaluate the advantage of our phonetic model on the TIMIT and Switchboard-mini datasets in complementarity to a powerful context-dependent deep neural network (DNN) acoustic classifier and a higher-level 3-gram language model. Consistent improvements of 2-10% in phone accuracy and 3% in word error rate suggest that our approach can readily replace HMMs in current state-of-the-art systems.
Keywords :
acoustic signal processing; error statistics; hidden Markov models; recurrent neural nets; signal classification; speech recognition; DNN; HMM; Switchboard-mini datasets; TIMIT datasets; acoustic classifier; arbitrary frame-level acoustic model; context-dependent deep neural network; decoding; hidden Markov model; higher-level 3-gram language model; hybrid architecture; phone accuracy; phone sequence modeling; phonetic model; recurrent neural networks; sequence alignment; speech recognition; word error rate; Acoustics; Context modeling; Hidden Markov models; Mathematical model; Recurrent neural networks; Speech recognition; Training; Recurrent neural network; phonetic model; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854638
Filename :
6854638