Title :
Weight groupings in the training of recurrent networks
Author :
Chan, Lai-Wan ; Szeto, Chi-Cheong
Author_Institution :
Comput. Sci. & Eng. Dept., Chinese Univ. of Hong Kong, Shatin, Hong Kong
Abstract :
We use a block-diagonal matrix to approximate the Hessian matrix in the Levenberg-Marquardt method for the training of recurrent neural networks. A substantial improvement in training time over the original Levenberg-Marquardt method is observed without degrading generalization ability. Three weight grouping methods (correlation blocks, k-unit blocks, and layer blocks) were investigated and compared. Their computational complexity, approximation ability, and training time are analyzed.
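To illustrate the idea in the abstract, the following is a minimal sketch (not the authors' code) of one Levenberg-Marquardt step where the Gauss-Newton Hessian J^T J is replaced by a block-diagonal approximation: the weights are partitioned into groups (e.g. correlation blocks, k-unit blocks, or layer blocks), and each block system is solved independently. The function name, block format, and damping parameter are assumptions for this sketch.

```python
import numpy as np

def lm_step_blockdiag(J, r, blocks, lam=1e-2):
    """One Levenberg-Marquardt step with a block-diagonal
    approximation of the Gauss-Newton Hessian J^T J.

    J      : (n_samples, n_weights) Jacobian of residuals w.r.t. weights
    r      : (n_samples,) residual vector
    blocks : list of index arrays partitioning the weights into groups
             (hypothetical stand-in for correlation/k-unit/layer blocks)
    lam    : damping factor (illustrative default)
    Returns the weight update delta.
    """
    n = J.shape[1]
    delta = np.zeros(n)
    g = J.T @ r                                   # full gradient J^T r
    for idx in blocks:
        Jb = J[:, idx]                            # Jacobian columns of this group
        Hb = Jb.T @ Jb + lam * np.eye(len(idx))   # damped block of the Hessian
        delta[idx] = np.linalg.solve(Hb, g[idx])  # solve only this block
    return delta
```

Solving several small block systems instead of one full system is what yields the training-time savings: the cost of inverting a dense p-by-p Hessian drops from O(p^3) to the sum of cubes of the block sizes.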
Keywords :
Hessian matrices; computational complexity; generalisation (artificial intelligence); learning (artificial intelligence); recurrent neural nets; Hessian matrix; Levenberg-Marquardt method; approximation; block-diagonal matrix; computational complexity; correlation blocks; generalization; learning time; recurrent neural networks; weight grouping; Approximation methods; Computer science; Degradation; Equations; Intelligent networks; Jacobian matrices; Learning systems; Matrix decomposition; Neurons; Recurrent neural networks
Conference_Title :
Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000)
Conference_Location :
Como
Print_ISBN :
0-7695-0619-4
DOI :
10.1109/IJCNN.2000.861275