Title of article :
Multiagent learning using a variable learning rate
Author/Authors :
J. Michael Bowling, Manuela Veloso
Issue Information :
Journal with serial number, year 2002
Pages :
36
From page :
215
To page :
250
Abstract :
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on the policies of the other agents. This creates a situation of learning a moving target. Previous learning algorithms have one of two shortcomings depending on their approach. They either converge to a policy that may not be optimal against the specific opponents' policies, or they may not converge at all. In this article we examine this learning problem in the framework of stochastic games. We look at a number of previous learning algorithms showing how they fail at one of the above criteria. We then contribute a new reinforcement learning technique using a variable learning rate to overcome these shortcomings. Specifically, we introduce the WoLF principle, “Win or Learn Fast”, for varying the learning rate. We examine this technique theoretically, proving convergence in self-play on a restricted class of iterated matrix games. We also present empirical results on a variety of more general stochastic games, in situations of self-play and otherwise, demonstrating the wide applicability of this method.
Keywords :
Multiagent learning , Game theory , Reinforcement learning
Journal title :
Artificial Intelligence
Serial Year :
2002
Record number :
1207108