DocumentCode
3484077
Title
Limitations of gradient methods in sequence learning
Author
Federici, Diego
Author_Institution
Dept. of Comput. & Inf. Sci., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
Volume
5
fYear
2002
fDate
18-22 Nov. 2002
Firstpage
2369
Abstract
Recurrent neural networks, such as the well-known Simple Recurrent Network (SRN; J. Elman, 1990), trained to predict their own next input vector, offer a promising framework for developing internal representations of environmental structure. Current training techniques focus on various gradient methods and genetic search. These techniques have the advantages of being general, of developing distributed representations, and of achieving holistic computation. On the other hand, their generality does not pay off in terms of learning speed, accuracy, or flexibility. In this paper, a temporal learning problem is analyzed with respect to traditional online learning approaches. The results show that gradient methods do not offer a way to identify and correct the actual cause of misclassifications and are therefore prone to becoming stuck in local maxima.
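To make the training setup concrete, the following is a minimal sketch (not the paper's code) of an Elman SRN trained online by gradient descent to predict its own next input. The layer sizes, the toy sequence, the learning rate, and the one-step-truncated gradient are illustrative assumptions, not details taken from the paper.

# A minimal sketch of an Elman Simple Recurrent Network trained online
# by gradient descent to predict its own next input vector. All sizes,
# the toy sequence, and the learning rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid = 4, 8                           # assumed layer sizes
W_xh = rng.normal(0, 0.1, (n_hid, n_in))     # input -> hidden weights
W_hh = rng.normal(0, 0.1, (n_hid, n_hid))    # context -> hidden weights
W_hy = rng.normal(0, 0.1, (n_in, n_hid))     # hidden -> output weights
lr = 0.1                                     # assumed learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy periodic sequence of one-hot vectors: the network must learn to
# predict the next symbol from the current input plus its context units.
seq = np.eye(n_in)[[0, 1, 2, 3] * 50]

h = np.zeros(n_hid)                          # Elman context layer
for t in range(len(seq) - 1):
    x, target = seq[t], seq[t + 1]
    h_new = sigmoid(W_xh @ x + W_hh @ h)     # new hidden state
    y = sigmoid(W_hy @ h_new)                # next-input prediction
    err = y - target                         # prediction error

    # Elman-style online update: gradients flow only one step back
    # (the context is treated as a fixed extra input), the kind of
    # truncated credit assignment the paper argues cannot identify
    # the actual cause of a misclassification.
    d_y = err * y * (1 - y)
    d_h = (W_hy.T @ d_y) * h_new * (1 - h_new)
    W_hy -= lr * np.outer(d_y, h_new)
    W_xh -= lr * np.outer(d_h, x)
    W_hh -= lr * np.outer(d_h, h)
    h = h_new                                # copy to context layer

print("final squared error:", float(err @ err))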
Keywords
genetic algorithms; gradient methods; learning (artificial intelligence); recurrent neural nets; Simple Recurrent Network; environmental structure; genetic search; internal representations; recurrent neural networks; sequence learning; temporal learning problem; Autonomous agents; Cognitive robotics; Computer networks; Distributed computing; Genetics; Gradient methods; Information science; Intelligent networks; Recurrent neural networks; Robot sensing systems;
fLanguage
English
Publisher
ieee
Conference_Title
Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02), 2002
Print_ISBN
981-04-7524-1
Type
conf
DOI
10.1109/ICONIP.2002.1201918
Filename
1201918