Title :
Solving Problems with Extended Reachability Goals through Reinforcement Learning on Propositionally Constrained State Spaces
Author :
de Araujo, Anderson V.; Ribeiro, Carlos H. C.
Author_Institution :
Computer Science Division, Instituto Tecnologico de Aeronautica (ITA), Sao Jose dos Campos, Brazil
Abstract :
Finding a near-optimal action policy towards a goal state can be a complex task for intelligent autonomous agents, especially in a model-free environment with unknown rewards and under state space constraints. In such a situation it is not possible to plan ahead which action is best to execute at each moment, and discovering the states that can be visited during plan execution requires knowing in advance the conditions to be preserved at each environment state. We present a new approach for discovering action policies in MDP problems whose environments are subject to propositional constraints on states. The constraints are used by a strong probabilistic planning algorithm to reduce a state space whose transition probabilities are estimated by an action-learning reinforcement learning algorithm, thus simplifying the agent's state space exploration and helping to define the planning problem. The execution constraints, or preservation goals, combined with the representation of the final goal, compose the extended reachability goals. Experiments to validate the proposal were performed on an antenna coverage problem and produced promising results, demonstrating fast convergence to condition-preserving near-optimal policies that keep a set of propositions valid while reaching a final goal.
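The sketch below is a minimal, hypothetical illustration (not the authors' implementation) of the idea summarized in the abstract: propositional constraints prune the state space, transition probabilities are estimated from the agent's own experience (a simplified stand-in for the action-learning reinforcement learning step), and a planner, here plain value iteration standing in for the strong probabilistic planning algorithm, computes a policy on the constrained, estimated model. All names and the environment interface (satisfies, estimate_transitions, plan_on_constrained_model, env.reset, env.actions, env.step) are assumptions made for this example.
```python
from collections import defaultdict
import random


def satisfies(state, propositions):
    # Hypothetical helper: a state is admissible if it preserves every
    # required proposition (each given as a boolean predicate over states).
    return all(prop(state) for prop in propositions)


def estimate_transitions(env, propositions, episodes=500, horizon=50):
    # Estimate P(s' | s, a) by counting transitions observed during random
    # exploration, discarding successors that violate the constraints.
    # Stands in for the action-learning RL step described in the abstract.
    counts = defaultdict(lambda: defaultdict(int))
    for _ in range(episodes):
        s = env.reset()
        for _ in range(horizon):
            a = random.choice(env.actions(s))
            s_next = env.step(s, a)
            if satisfies(s_next, propositions):
                counts[(s, a)][s_next] += 1
                s = s_next
            else:
                break  # leaving the constrained region ends the episode
    model = {}
    for (s, a), successors in counts.items():
        total = sum(successors.values())
        model[(s, a)] = {sp: n / total for sp, n in successors.items()}
    return model


def plan_on_constrained_model(model, is_goal, gamma=0.95, sweeps=200):
    # Value iteration over the constrained, estimated model; a simple
    # stand-in for the strong probabilistic planning step. Each action
    # costs 1 until a goal state is reached.
    states = {s for (s, _a) in model}
    states |= {sp for dist in model.values() for sp in dist}
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            if is_goal(s):
                V[s] = 0.0
                continue
            qs = [-1.0 + gamma * sum(p * V[sp] for sp, p in dist.items())
                  for (s0, _a), dist in model.items() if s0 == s]
            if qs:
                V[s] = max(qs)
    # Extract a greedy policy with respect to the computed values.
    policy = {}
    for s in states:
        if is_goal(s):
            continue
        best_q, best_a = None, None
        for (s0, a), dist in model.items():
            if s0 != s:
                continue
            q = -1.0 + gamma * sum(p * V[sp] for sp, p in dist.items())
            if best_q is None or q > best_q:
                best_q, best_a = q, a
        if best_a is not None:
            policy[s] = best_a
    return policy
```
Under these assumptions, an agent would call estimate_transitions on its environment and then plan_on_constrained_model with a goal test, obtaining a policy that both preserves the propositions and drives the system toward the final goal, mirroring the extended reachability goals described above.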
Keywords :
Markov processes; learning (artificial intelligence); multi-agent systems; probability; MDP problems; Markov decision process; action-learning reinforcement learning algorithm; agent state space exploration; antenna coverage problem; condition-preserving near-optimal policies; execution constraints; extended reachability goals; intelligent autonomous agents; model-free environment; near-optimal action policy; preservation goals; problem solving; propositionally constrained state space; strong probabilistic planning algorithm; transition probabilities; unknown rewards; Antennas; Convergence; Learning (artificial intelligence); Markov processes; Planning; Probabilistic logic; Standards; Agent Learning; Agents; Extended Reachability Goals; Markov Decision Processes; Planning; Q-Learning;
Conference_Title :
2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Conference_Location :
Manchester
DOI :
10.1109/SMC.2013.266