DocumentCode :
1799343
Title :
Heuristics for multiagent reinforcement learning in decentralized decision problems
Author :
Allen, Martin W. ; Hahn, David ; MacFarland, Douglas C.
Author_Institution :
Comput. Sci. Dept., Univ. of Wisconsin-La Crosse, La Crosse, WI, USA
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
1
Lastpage :
8
Abstract :
Decentralized partially observable Markov decision processes (Dec-POMDPs) model cooperative multiagent scenarios, providing a powerful general framework for team-based artificial intelligence. While optimal algorithms exist for Dec-POMDPs, theoretical and empirical results demonstrate that they are impractical for many problems of real interest. We examine the use of reinforcement learning (RL) as a means of generating adequate, if not optimal, joint policies for Dec-POMDPs. As expected, and easily demonstrated, single-agent RL applied independently by each agent produces policies of little joint utility. We therefore investigate heuristic methods, based on the dynamics of the Dec-POMDP formulation, that bias the learning process toward coordinated action. Empirical tests on a benchmark problem show that these heuristics significantly enhance learning performance, even outperforming a hand-crafted heuristic in cases where the learning process converges quickly.
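The core idea the abstract describes, biasing otherwise-independent learners toward coordinated action, can be sketched in miniature. The following is an illustrative sketch only, not the authors' algorithm or their benchmark: the toy matching task, the bias term, and the parameters (ALPHA, EPSILON, BIAS_WEIGHT) are all assumptions made for exposition.

```python
"""Sketch: independent Q-learners on a toy cooperative task, with a
pluggable heuristic bias on action selection. Hypothetical stand-in for
the general approach in the abstract, not the paper's method."""
import random

ACTIONS = (0, 1)           # each agent's local action set (assumed)
ALPHA, EPSILON = 0.1, 0.1  # learning rate and exploration rate (assumed)
BIAS_WEIGHT = 0.5          # strength of the coordination heuristic (assumed)

def joint_reward(a0, a1):
    # Toy team task: reward arrives only when the agents' actions coincide.
    return 1.0 if a0 == a1 else 0.0

class BiasedQLearner:
    def __init__(self):
        self.q = {a: 0.0 for a in ACTIONS}
        self.bias = {a: 0.0 for a in ACTIONS}  # heuristic coordination term

    def act(self):
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        # Action selection mixes the learned value with the heuristic bias.
        return max(ACTIONS, key=lambda a: self.q[a] + BIAS_WEIGHT * self.bias[a])

    def update(self, action, reward):
        # Stateless (bandit-style) Q update for the repeated toy game.
        self.q[action] += ALPHA * (reward - self.q[action])
        # Hypothetical heuristic: reinforce the action taken on steps that
        # produced team reward, and decay the bias everywhere else.
        for a in ACTIONS:
            self.bias[a] = 0.9 * self.bias[a] + (reward if a == action else 0.0)

agents = [BiasedQLearner(), BiasedQLearner()]
for step in range(2000):
    a0, a1 = agents[0].act(), agents[1].act()
    r = joint_reward(a0, a1)
    agents[0].update(a0, r)
    agents[1].update(a1, r)

print("final greedy joint action:",
      max(ACTIONS, key=lambda a: agents[0].q[a]),
      max(ACTIONS, key=lambda a: agents[1].q[a]))
```

Without the bias term, two independent epsilon-greedy learners can oscillate between the two symmetric equilibria of this matching game; the bias acts as a shared-success signal that pulls both learners toward the same joint action, which is the flavor of coordination heuristic the abstract alludes to.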
Keywords :
Markov processes; learning (artificial intelligence); multi-agent systems; Dec-POMDP model; cooperative multiagent systems; decentralized decision problems; heuristic methods; multiagent reinforcement learning; partially observable Markov decision processes; team-based artificial intelligence; benchmark testing; complexity theory; heuristic algorithms
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Conference_Location :
Orlando, FL, USA
Type :
conf
DOI :
10.1109/ADPRL.2014.7010642
Filename :
7010642