Title :
Incremental policy learning: an equilibrium selection algorithm for reinforcement learning agents with common interests
Author :
Fulda, Nancy ; Ventura, Dan
Author_Institution :
Dept. of Comput. Sci., Brigham Young Univ., Provo, UT, USA
Abstract :
We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether or not the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm learns to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able to learn good joint policies when the heuristic's parameters are estimated during learning, rather than known in advance.
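The abstract does not give the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a linear reward-inaction style update borrowed from learning automata: each agent shifts probability toward the action it just played when the heuristic reports success, and otherwise leaves its policy unchanged. The payoff matrix, the threshold heuristic, and the names `PAYOFF`, `TARGET`, `ALPHA`, `reached_equilibrium`, `update`, and `train` are all hypothetical and chosen for illustration.

```python
import random

# Hypothetical 3x3 common-interest game: both agents receive PAYOFF[a1][a2].
# The joint action (1, 1) is the single optimal strict coordination equilibrium.
PAYOFF = [
    [5, 0, 0],
    [0, 10, 0],
    [0, 0, 5],
]
TARGET = 10   # assumption: the optimal equilibrium payoff is known in advance
ALPHA = 0.1   # assumption: step size of the incremental adjustment


def reached_equilibrium(reward):
    """Hypothetical heuristic: the outcome is desirable iff it attains TARGET."""
    return reward >= TARGET


def update(policy, action, success):
    """Shift probability mass toward `action` after a desirable outcome.

    On an undesirable outcome the policy is returned unchanged
    (reward-inaction style; an assumption, not the paper's stated rule).
    """
    if not success:
        return policy
    return [p + ALPHA * ((1.0 if i == action else 0.0) - p)
            for i, p in enumerate(policy)]


def train(steps=2000, seed=0):
    """Run two independent learners in the deterministic game above."""
    rng = random.Random(seed)
    n = len(PAYOFF)
    policies = [[1.0 / n] * n, [1.0 / n] * n]  # start from uniform policies
    for _ in range(steps):
        # Each agent samples its own action; neither observes the other's choice.
        a1 = rng.choices(range(n), weights=policies[0])[0]
        a2 = rng.choices(range(n), weights=policies[1])[0]
        success = reached_equilibrium(PAYOFF[a1][a2])
        policies[0] = update(policies[0], a1, success)
        policies[1] = update(policies[1], a2, success)
    return policies


if __name__ == "__main__":
    for agent, policy in enumerate(train()):
        print(agent, [round(p, 3) for p in policy])
```

In this sketch the policies change only when the optimal joint action is played, so each agent's probability of its half of that action never decreases. Estimating the heuristic's parameters during learning, which the abstract also mentions, is not modelled here.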
Keywords :
learning (artificial intelligence); multi-agent systems; optimisation; probability; stochastic processes; equilibrium selection algorithm; incremental policy learning; optimal equilibrium; probability; reinforcement learning agents; stochastic environments; Computer science; Learning; Minimax techniques; Parameter estimation; Stochastic processes; Taxonomy;
Conference_Title :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1380091