Title :
Self-modifying reinforcement learning
Author_Institution :
IST Res., Ningbo Univ., Zhejiang, China
Abstract :
We describe several experiments with reinforcement learning systems based on the technique of incremental self-improvement (IS). IS uses the success-story algorithm (SSA) to undo unrewarding policy changes computed by self-modifying policies. The experiments demonstrate the advantages of IS over stochastic hill climbing and TD Q-learning in noisy environments given limited computational resources.
Keywords :
learning (artificial intelligence); learning automata; stochastic automata; TD Q-learning; incremental self-improvement; noisy environments; self-modifying reinforcement learning; stochastic hill climbing; success-story algorithm; Acceleration; Genetic algorithms; Learning; Monitoring; Noise measurement; Performance evaluation; Stochastic processes; Testing; Time measurement; Working environment noise;
Conference_Title :
Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7803-7508-4
DOI :
10.1109/ICMLC.2002.1175418