DocumentCode :
3744889
Title :
Single and multi-channel approaches for distant speech recognition under noisy reverberant conditions: I2R's system description for the ASpIRE challenge
Author :
Jonathan Dennis;Tran Huy Dat
Author_Institution :
Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, Singapore 138632
fYear :
2015
Firstpage :
518
Lastpage :
524
Abstract :
In this paper, we introduce the system developed at the Institute for Infocomm Research (I2R) for the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge. The front-end processing system has three main components: a distributed beamforming algorithm that performs adaptive weighting and channel elimination, a speech dereverberation approach using a maximum-kurtosis criterion, and a robust voice activity detection (VAD) module based on the sub-harmonic ratio (SHR). The acoustic back-end consists of a multi-conditional Deep Neural Network (DNN) model that uses speaker-adapted features, combined with a decoding strategy that performs semi-supervised DNN model adaptation using weighted labels generated from the first-pass decoding output. On the single-microphone evaluation, our system achieved a word error rate (WER) of 44.8%. With the incorporation of beamforming on the multi-microphone evaluation, our system reduced the WER by over 6% absolute, giving the best evaluation result of 38.5%.
Keywords :
"Speech","Training","Speech recognition","Acoustics","Adaptation models","Robustness","Testing"
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
Type :
conf
DOI :
10.1109/ASRU.2015.7404839
Filename :
7404839