Title :
Limitations of gradient methods in sequence learning
Author_Institution :
Dept. of Comput. & Inf. Sci., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
Abstract :
Recurrent neural networks, such as the well-known Simple Recurrent Network (SRN; J. Elman, 1990), trained to predict their own next input vector offer a promising framework for developing internal representations of environmental structure. Current training techniques focus on various gradient methods and genetic search. These techniques have the advantage of being general, of developing distributed representations, and of achieving holistic computation. On the other hand, their generality does not pay off in terms of learning speed, accuracy, or flexibility. In this paper a temporal learning problem is analyzed with respect to traditional online learning approaches. The results show that gradient methods offer no way to identify and correct the actual cause of misclassifications and are therefore prone to getting stuck in local minima of the error surface.
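The next-input prediction setup the abstract describes can be illustrated with a minimal sketch of an Elman-style SRN trained by plain online gradient descent, truncated after one step as in Elman (1990). All hyperparameters (hidden size, learning rate, the toy a→b→c cycle) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy environment: the deterministic symbol cycle 0 -> 1 -> 2 -> 0 -> ...,
# presented as one-hot input vectors; the target at time t is the input at t+1.
seq = [0, 1, 2] * 200
X = np.eye(3)[seq]

n_in, n_hid = 3, 8  # illustrative sizes
Wxh = rng.normal(0, 0.5, (n_hid, n_in))   # input -> hidden
Whh = rng.normal(0, 0.5, (n_hid, n_hid))  # context (previous hidden) -> hidden
Who = rng.normal(0, 0.5, (n_in, n_hid))   # hidden -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.1
h = np.zeros(n_hid)
losses = []
for t in range(len(seq) - 1):
    x, target = X[t], X[t + 1]
    h_prev = h
    h = np.tanh(Wxh @ x + Whh @ h_prev)
    p = softmax(Who @ h)
    losses.append(-np.log(p[seq[t + 1]]))
    # Online cross-entropy gradients; the error is not propagated through
    # h_prev (one-step truncation, as in the original SRN training scheme).
    d_o = p - target
    d_h = (Who.T @ d_o) * (1 - h ** 2)
    Who -= lr * np.outer(d_o, h)
    Wxh -= lr * np.outer(d_h, x)
    Whh -= lr * np.outer(d_h, h_prev)

# After training, read off the network's next-symbol predictions for one cycle.
h = np.zeros(n_hid)
preds = []
for t in range(3):
    h = np.tanh(Wxh @ X[t] + Whh @ h)
    preds.append(int(np.argmax(softmax(Who @ h))))
print(preds)  # the true next symbols in the cycle are [1, 2, 0]
```

Note that the gradient update only ever adjusts weights along the local error slope; nothing in the loop identifies *which* earlier event caused a misclassification, which is exactly the limitation the paper analyzes.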
Keywords :
genetic algorithms; gradient methods; learning (artificial intelligence); recurrent neural nets; Simple Recurrent Network; environmental structure; genetic search; internal representations; recurrent neural networks; sequence learning; temporal learning problem; Autonomous agents; Cognitive robotics; Computer networks; Distributed computing; Genetics; Gradient methods; Information science; Intelligent networks; Recurrent neural networks; Robot sensing systems;
Conference_Titel :
Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02), 2002
Print_ISBN :
981-04-7524-1
DOI :
10.1109/ICONIP.2002.1201918