Title :
Q-PSP learning: an exploitation-oriented Q-learning algorithm and its applications
Author :
Horiuchi, Tadashi ; Fujino, Akinori ; Katai, Osamu ; Sawaragi, Tetsuo
Author_Institution :
Graduate Sch. of Eng., Kyoto Univ., Japan
Abstract :
Proposes the Q-PSP learning method, which incorporates the idea of the profit-sharing plan (PSP), as used in classifier systems, into Q-learning as a type of “exploitation-oriented” reinforcement learning in order to take advantage of the merits of these two approaches. Applying Q-PSP learning to several problems shows that not only can a speed-up in learning be expected, but effectiveness on complex problems can also be improved and an appropriate balance between exploration and exploitation can be attained.
Keywords :
dynamic programming; genetic algorithms; learning (artificial intelligence); pattern classification; problem solving; Q-PSP learning method; Q-learning algorithm; classifier systems; complex problem solving effectiveness; exploitation-oriented reinforcement learning; exploration; learning speedup; profit-sharing plan; Bismuth; Learning systems; Petroleum
Conference_Title :
Evolutionary Computation, 1996., Proceedings of IEEE International Conference on
Conference_Location :
Nagoya
Print_ISBN :
0-7803-2902-3
DOI :
10.1109/ICEC.1996.542337