Title :
Bayesian-Game-Based Fuzzy Reinforcement Learning Control for Decentralized POMDPs
Author :
Sharma, Rajneesh; Spaan, Matthijs T. J.
Abstract :
This paper proposes a Bayesian-game-based fuzzy reinforcement learning (RL) controller for decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs have recently emerged as a powerful framework for optimizing multiagent sequential decision making in partially observable stochastic environments. However, finding an exact optimal solution to a Dec-POMDP is provably intractable (NEXP-complete), necessitating approximate or suboptimal solution approaches. The proposed approach constructs an approximate solution by employing fuzzy inference systems (FISs) in a game-based RL setting. It uses the universal approximation capability of fuzzy systems to compactly represent a Dec-POMDP as a fuzzy Dec-POMDP, allowing the controller to progressively learn and update an approximate solution to the underlying Dec-POMDP. The controller applies FIS-based RL to Dec-POMDPs modeled as a sequence of Bayesian games (BGs). We implement the proposed controller for two scenarios: 1) Dec-POMDPs with free communication between agents and 2) Dec-POMDPs without communication. We empirically evaluate the approach on three standard benchmark problems: 1) multiagent tiger; 2) multiaccess broadcast channel; and 3) recycling robot. Simulation results and comparative evaluation against other Dec-POMDP solution approaches demonstrate the effectiveness and feasibility of FIS-based game-theoretic RL for designing Dec-POMDP controllers.
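Illustrative sketch (not the authors' implementation): the abstract describes approximating a Dec-POMDP with an FIS whose rule consequents are learned by RL. The Python fragment below shows only that value-approximation component for a single agent on a 2-state problem (in the spirit of the multiagent tiger benchmark), where the agent's belief b = P(tiger-left) is covered by triangular fuzzy sets and per-rule, per-action Q-values are updated with a standard fuzzy Q-learning rule. All names, membership functions, and parameters are assumptions made for illustration; the paper's Bayesian-game stages and the communication/no-communication variants are not modeled here.

import random

ACTIONS = ["listen", "open-left", "open-right"]
RULE_CENTERS = [0.0, 0.25, 0.5, 0.75, 1.0]   # triangular fuzzy sets over the belief (assumed)

def memberships(b):
    """Normalized triangular membership of belief b in each fuzzy rule."""
    width = 0.25
    mu = [max(0.0, 1.0 - abs(b - c) / width) for c in RULE_CENTERS]
    s = sum(mu) or 1.0
    return [m / s for m in mu]

class FuzzyQAgent:
    def __init__(self, alpha=0.1, gamma=0.95, eps=0.1):
        # One learnable consequent Q-value per (rule, action) pair.
        self.q = [[0.0 for _ in ACTIONS] for _ in RULE_CENTERS]
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def q_values(self, b):
        """Q-value of each action at belief b, interpolated by the firing strengths."""
        mu = memberships(b)
        return [sum(m * self.q[r][a] for r, m in enumerate(mu))
                for a in range(len(ACTIONS))]

    def act(self, b):
        """Epsilon-greedy action selection over the fuzzy-interpolated Q-values."""
        if random.random() < self.eps:
            return random.randrange(len(ACTIONS))
        qs = self.q_values(b)
        return max(range(len(ACTIONS)), key=lambda a: qs[a])

    def update(self, b, a, reward, b_next):
        """Distribute the temporal-difference error over the rules by firing strength."""
        target = reward + self.gamma * max(self.q_values(b_next))
        td = target - self.q_values(b)[a]
        for r, m in enumerate(memberships(b)):
            self.q[r][a] += self.alpha * m * td

In the full method each decision stage is a Bayesian game over the agents' private observation histories; this sketch only conveys how a compact fuzzy rule base can stand in for a tabular Q-function in that setting.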
Keywords :
Bayes methods; Markov processes; control system synthesis; decentralised control; decision making; fuzzy control; fuzzy reasoning; game theory; learning (artificial intelligence); learning systems; multi-robot systems; observability; stochastic systems; Bayesian-game-based fuzzy reinforcement learning control; Dec-POMDP controller design; NEXP-complete; agent communication; approximate solution; decentralized POMDP; decentralized partially observable Markov decision process; fuzzy inference system; fuzzy system; multiaccess broadcast channel; multiagent sequential decision making optimization; multiagent tiger; partially observable stochastic environment; recycling robot; suboptimal solution; Approximation methods; Bayesian methods; Games; Learning; Robots; Sensors; Uncertainty; Bayesian games (BGs); decentralized partially observable Markov decision processes (Dec-POMDPs); fuzzy systems; reinforcement learning (RL);
Journal_Title :
IEEE Transactions on Computational Intelligence and AI in Games
DOI :
10.1109/TCIAIG.2012.2212279