DocumentCode :
1749701
Title :
Modular neural networks exploit large acoustic context through broad-class posteriors for continuous speech recognition
Author :
Antoniou, Christos
Author_Institution :
Dept. of Comput. Sci., Essex Univ., Colchester, UK
Volume :
1
fYear :
2001
fDate :
2001
Firstpage :
505
Abstract :
Traditionally, neural networks such as multi-layer perceptrons handle acoustic context by increasing the dimensionality of the observation vector to include information from the neighbouring acoustic vectors on either side of the current frame. As a result, the monolithic network is trained on a high-dimensional space. The trend is to use the same fixed-size observation vector across a single network that estimates the posterior probabilities for all phones simultaneously. We propose a decomposition of the network into modular components, where each component estimates a phone posterior. The size of the observation vector we use is not fixed across the modularised networks, but rather depends on the phone that each network is trained to classify. For each observation vector, we estimate very large acoustic context through broad-class posteriors. The use of the broad-class posteriors along with the phone posteriors greatly enhances acoustic modelling. We report significant improvements in phone classification and word recognition on the TIMIT corpus. Our results are also better than those of the best context-dependent system in the literature.
Keywords :
Viterbi decoding; hidden Markov models; multilayer perceptrons; speech recognition; TIMIT corpus; acoustic modelling; acoustic vectors; broad-class posteriors; continuous speech recognition; dimensionality; large acoustic context; modular neural networks; monolithic network; multi-layer perceptrons; phone classification; phone posteriors; word recognition; Computer architecture; Computer science; Detectors; Hidden Markov models; Multi-layer neural network; Multilayer perceptrons; Neural networks; Speech processing; Speech recognition; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
ISSN :
1520-6149
Print_ISBN :
0-7803-7041-4
Type :
conf
DOI :
10.1109/ICASSP.2001.940878
Filename :
940878