DocumentCode :
323838
Title :
An MRNN-based method for continuous Mandarin speech recognition
Author :
Liao, Yuan-Fu ; Chen, Sin-Horng
Author_Institution :
Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume :
2
fYear :
1998
fDate :
12-15 May 1998
Firstpage :
1121
Abstract :
A new modular recurrent neural network (MRNN)-based method for continuous Mandarin speech recognition is proposed. The system uses five RNNs to handle separate subtasks and then combines their outputs to solve the overall problem. They comprise two RNNs for discriminating the two sub-syllable groups of 100 right-final-dependent (RFD) initials and 39 context-independent (CI) finals, two RNNs for generating dynamic weighting functions for sub-syllable integration, and one RNN for syllable boundary detection. All RNN modules are combined using a delay-decision Viterbi search. The method differs from the ANN/HMM hybrid approach in that it uses ANNs to perform not only sub-syllable discrimination but also temporal structure modeling of the speech signal. The system is trained using a three-stage training method embedded with the MCE/GPD algorithms. In addition, a fast recognition method using multi-level pruning is proposed. Experimental results showed that the method outperforms the HMM method in both recognition accuracy and computational complexity.
Keywords :
backpropagation; computational complexity; natural languages; neural net architecture; recurrent neural nets; search problems; speech processing; speech recognition; ANN/HMM hybrid approach; CI finals; MCE/GPD algorithms; MRNN-based method; RFD initials; RNN modules; computational complexity; context independent finals; continuous Mandarin speech recognition; delay-decision Viterbi search; dynamic weighting functions; error backpropagation algorithm; experimental results; fast recognition method; modular recurrent neural network; multi-level pruning; neural network architecture; recognition accuracy; right-final-dependent initials; speech signal; sub-syllable groups discrimination; sub-syllable integration; syllable boundary detection; temporal structure modeling; three-stage training method; Artificial neural networks; Computational complexity; Contracts; Councils; Delay; Error analysis; Hidden Markov models; Recurrent neural networks; Speech recognition; Viterbi algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
ISSN :
1520-6149
Print_ISBN :
0-7803-4428-6
Type :
conf
DOI :
10.1109/ICASSP.1998.675466
Filename :
675466