Regional Information Center for Science and Technology - Eliciting preferences over observed behaviours based on relative evaluations

DocumentCode :

2340430

Title :

Eliciting preferences over observed behaviours based on relative evaluations

Author :

Da Silva, Valdinei Freire ; Lima, Pedro ; Costa, Anna Helena Reali

Author_Institution :

Univ. of Sao Paulo, Sao Paulo

fYear :

2007

fDate :

Oct. 29 2007-Nov. 2 2007

Firstpage :

423

Lastpage :

428

Abstract :

Reinforcement learning addresses the problem of programming an autonomous agent to execute tasks that are described as reinforcement functions; the agent is then responsible for discovering the best actions to fulfil the task. Most work on reinforcement learning assumes that reinforcements are given by the environment and does not address how to describe tasks as reinforcement functions. Preference elicitation addresses the problem of describing human preferences through utility functions, of which reinforcement functions are a special case. This paper proposes an approach in which preference elicitation and reinforcement learning are handled in an integrated manner, providing an autonomous method of programming an agent. The agent is programmed through pairwise evaluations of its observed behaviours, and these evaluations are summarised in the reinforcement function. We propose a new algorithm, PEOB-RS, that can be shown to converge towards an optimal policy, provided that the number of trials for each behaviour tends to infinity. Experiments in a stochastic grid environment are used to obtain a reinforcement function and illustrate the effectiveness of PEOB-RS, although it requires a large number of evaluations. This reinforcement function is then transferred to a more realistic environment simulating a Pioneer robot, demonstrating the abstraction property of utility functions.
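
The idea of summarising pairwise evaluations in a reinforcement function can be illustrated with a small sketch. This is not PEOB-RS, whose details are not given in this record; it is a generic perceptron-style fit of a linear utility, shown only to make the idea concrete. The feature layout (steps taken, goal reached, hazard visits), the behaviours, the hidden weights and the function names are all hypothetical, and a deterministic simulated evaluator stands in for the human evaluator and the stochastic trials described in the abstract.

```python
from itertools import combinations


def utility(weights, features):
    """Linear utility: weighted sum of a behaviour's feature counts."""
    return sum(w * f for w, f in zip(weights, features))


def fit_reward(preferences, n_features, max_epochs=10_000):
    """Perceptron-style fit (an assumption, not the paper's algorithm).

    Each preference is a pair (better, worse) of feature vectors. The weights
    are nudged until every preferred behaviour scores strictly higher.
    """
    weights = [0.0] * n_features
    for _ in range(max_epochs):
        updated = False
        for better, worse in preferences:
            diff = [b - w for b, w in zip(better, worse)]
            if utility(weights, diff) <= 0:  # pair is ranked wrongly or tied
                weights = [w + d for w, d in zip(weights, diff)]
                updated = True
        if not updated:  # a full pass with no mistakes: done
            break
    return weights


# Hypothetical observed behaviours: (steps taken, goal reached, hazard visits).
behaviours = [(4, 1, 0), (6, 1, 1), (3, 0, 0), (8, 1, 0), (5, 0, 2)]

# Hypothetical hidden utility of the evaluator, used only to simulate answers.
hidden_weights = (-1.0, 10.0, -5.0)

# Simulated pairwise evaluations: the evaluator only says which one is better.
preferences = []
for a, b in combinations(behaviours, 2):
    ua, ub = utility(hidden_weights, a), utility(hidden_weights, b)
    if ua > ub:
        preferences.append((a, b))
    elif ub > ua:
        preferences.append((b, a))

learned_weights = fit_reward(preferences, n_features=3)

# The learned weights need not equal the hidden ones; they only have to
# reproduce the same ordering over the evaluated pairs.
agrees = all(
    utility(learned_weights, better) > utility(learned_weights, worse)
    for better, worse in preferences
)
print("learned weights:", learned_weights)
print("all preferences reproduced:", agrees)
```

Because the learned weights are defined over abstract features rather than over a particular environment, the same utility could in principle score behaviours in a different environment that exposes the same features, which is the kind of transfer the abstract refers to.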

Keywords :

learning (artificial intelligence); mobile robots; robot programming; autonomous agent programming; grid stochastic environment; pioneer robot; preference elicitation; reinforcement learning; utility function; Autonomous agents; Functional programming; Intelligent robots; Learning; Notice of Violation; Programming profession; Robot programming; Stochastic processes; USA Councils; Utility theory;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Robots and Systems, 2007. IROS 2007. IEEE/RSJ International Conference on

Conference_Location :

San Diego, CA

Print_ISBN :

978-1-4244-0912-9

Electronic_ISBN :

978-1-4244-0912-9

Type :

conf

DOI :

10.1109/IROS.2007.4399403

Filename :

4399403

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2340430