DocumentCode :
3744877
Title :
An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework
Author :
Jun Du;Qing Wang;Yan-Hui Tu;Xiao Bao;Li-Rong Dai;Chin-Hui Lee
Author_Institution :
University of Science and Technology of China, Hefei, Anhui, P. R. China
fYear :
2015
Firstpage :
430
Lastpage :
435
Abstract :
We present an information fusion approach to robust recognition of microphone array speech for the recently launched 3rd CHiME Challenge. It is based on a deep learning framework with a large neural network consisting of subnets with different architectures. Multiple knowledge sources are integrated via an early fusion of normalized noisy features with different beamforming techniques, speech-enhanced features, speaker-related features, and other auxiliary features concatenated as the input to each subnet, and a late fusion by combining the outputs of all subnets to produce a single output set. Our experiments demonstrate that all information sources are complementary in our proposed framework. Our best system achieves an average word error rate reduction of 68% from the officially released baseline results on the test set of real data.
Keywords :
"Array signal processing","Training","Microphone arrays","Noise measurement","Speech","Speech recognition"
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
Type :
conf
DOI :
10.1109/ASRU.2015.7404827
Filename :
7404827