Title :
Multitask learning and system combination for automatic speech recognition
Author :
Olivier Siohan;David Rybach
Author_Institution :
Google Inc., New York
Abstract :
In this paper, we investigate the performance of an ensemble of convolutional, long short-term memory deep neural networks (CLDNN) on a large vocabulary speech recognition task. To reduce the computational complexity of running multiple recognizers in parallel, we instead propose an early system combination approach, which requires the construction of a static decoding network encoding the multiple context-dependent state inventories from the distinct acoustic models. To further reduce the computational load, the hidden units of those models can be shared while keeping the output layers distinct, leading to a multitask training formulation. However, in contrast to traditional multitask training, our formulation uses all predicted outputs, leading to a multitask system combination strategy. Results are presented on a Voice Search task designed for children, where the proposed approach outperforms our current production system.
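The shared-hidden-layer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' CLDNN: it assumes a single tanh hidden layer in place of the convolutional and LSTM stack, two toy state inventories of sizes 5 and 3, and a summed cross-entropy loss. All names (`init_model`, `forward`, `multitask_loss`) are hypothetical.

```python
import math
import random


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def affine(weights, bias, x):
    """Compute W x + b, where weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (an arbitrary choice)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def init_model(rng, n_in, n_hidden, head_sizes):
    """One shared hidden layer plus one output layer per state inventory."""
    return {
        "shared": init_layer(rng, n_hidden, n_in),
        "heads": [init_layer(rng, n, n_hidden) for n in head_sizes],
    }


def forward(model, frame):
    """Return one log-posterior vector per output layer for a single frame."""
    hidden = [math.tanh(a) for a in affine(*model["shared"], frame)]
    return [log_softmax(affine(w, b, hidden)) for w, b in model["heads"]]


def multitask_loss(log_posteriors, targets):
    """Sum of per-head cross-entropies (assumed equal task weights)."""
    return sum(-lp[t] for lp, t in zip(log_posteriors, targets))


rng = random.Random(0)
model = init_model(rng, n_in=4, n_hidden=8, head_sizes=[5, 3])
frame = [0.2, -0.1, 0.4, 0.0]          # toy acoustic feature vector
log_posteriors = forward(model, frame)  # both heads are kept, not just one
loss = multitask_loss(log_posteriors, targets=[2, 1])
```

In this sketch the hidden activations are computed once per frame and reused by every output layer, which is where the computational saving would come from. Keeping all heads' scores at inference time, rather than discarding the auxiliary ones, mirrors the combination strategy the abstract describes; how those scores enter the decoding network is not shown here.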
Keywords :
"Hidden Markov models","Acoustics","Training","Transducers","Speech recognition","Decoding","Speech"
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
DOI :
10.1109/ASRU.2015.7404849