Title :
Monte Carlo preference elicitation for learning additive reward functions
Author :
Rosenthal, Stephanie ; Veloso, Manuela
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
AI agents, including robots, often use reward functions to evaluate tradeoffs between different states and actions and to determine optimal policies. We are particularly interested in reward functions that can be decomposed into an additive sum of subrewards computed on independent subproblems or features of the state space. If these subrewards capture different reward metrics, such as user satisfaction and task completion time, it is unclear how to scale the subrewards in the reward function to produce an appropriate policy. In this work, we propose and evaluate a novel Monte Carlo method for learning the scaling factors of the subrewards, in which training elicits humans' preferences between pairs of state-action scenarios. Because the algorithm elicits preferences over explicit scenarios, it is less susceptible to human error than previous elicitation approaches. The preferences are used to generate a set of inequalities over the scaling factors, which we solve efficiently with a linear program. We show that our algorithm asks for a number of preferences proportional to the logarithm of the number of scaling-factor hypotheses used in the Monte Carlo method.
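A minimal sketch of the approach summarized above, under assumptions made only for illustration: the reward decomposes as R(s, a) = sum_i w_i * r_i(s, a), each preference of scenario A over scenario B yields the inequality w . (r(A) - r(B)) >= 0, and the collected inequalities are solved as a feasibility linear program over the scaling factors w. The hypothesis sampling, the even-split query heuristic, and the interfaces subreward(scenario) and ask_human(a, b) are illustrative placeholders, not the paper's implementation.

import numpy as np
from scipy.optimize import linprog


def fit_scaling_factors(preferences, n_features, margin=1e-3):
    """Find scaling factors w >= 0 (summing to 1) consistent with every
    elicited preference, posed as a feasibility linear program.

    `preferences` is a list of (r_pref, r_other) subreward-vector pairs; each
    pair contributes the inequality w . (r_pref - r_other) >= margin."""
    if not preferences:
        return np.full(n_features, 1.0 / n_features)  # no constraints yet
    # scipy expects A_ub @ w <= b_ub, so negate each preference inequality.
    A_ub = np.array([np.asarray(q) - np.asarray(p) for p, q in preferences])
    b_ub = np.full(len(preferences), -margin)
    res = linprog(c=np.zeros(n_features),  # zero objective: any feasible point will do
                  A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, n_features)), b_eq=[1.0],
                  bounds=[(0.0, None)] * n_features)
    return res.x if res.success else None


def elicit_scaling_factors(hypotheses, scenarios, subreward, ask_human):
    """Monte Carlo elicitation loop: a sampled set of candidate weight vectors
    (`hypotheses`) is pruned by each preference answer.  `subreward(s)` maps a
    scenario to its vector of subreward values and `ask_human(a, b)` returns
    whichever of the two scenarios the person prefers."""
    n_features = len(hypotheses[0])
    preferences = []
    while len(hypotheses) > 1:
        def split_gap(pair):
            a, b = pair
            prefer_a = sum(np.dot(w, subreward(a)) >= np.dot(w, subreward(b))
                           for w in hypotheses)
            return abs(prefer_a - len(hypotheses) / 2.0)
        # Query the scenario pair on which the hypotheses disagree most evenly,
        # so each answer discards roughly half of them.
        a, b = min(((x, y) for i, x in enumerate(scenarios)
                    for y in scenarios[i + 1:]), key=split_gap)
        winner = ask_human(a, b)
        loser = b if winner is a else a
        preferences.append((subreward(winner), subreward(loser)))
        survivors = [w for w in hypotheses
                     if np.dot(w, subreward(winner)) >= np.dot(w, subreward(loser))]
        if not survivors or len(survivors) == len(hypotheses):
            break  # inconsistent answer, or no remaining query is informative
        hypotheses = survivors
    return fit_scaling_factors(preferences, n_features)

The even-split heuristic mirrors the abstract's logarithmic query bound: a query on which the surviving hypotheses disagree roughly half-and-half discards about half of them per answer. The uniform-weight fallback when no preferences have been collected is a choice made only for this sketch.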
Keywords :
Monte Carlo methods; learning (artificial intelligence); linear programming; multi-agent systems; AI agents; Monte Carlo preference elicitation; additive reward function learning; human preference elicitation; linear program; optimal policy; reward metrics; state space; state-action scenarios; subreward additive sum; subreward scaling factor learning; task completion time; user satisfaction; Additives; Approximation algorithms; Concrete; Humans; Probabilistic logic; Robots
Conference_Titel :
2012 IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)
Conference_Location :
Paris, France
Print_ISBN :
978-1-4673-4604-7
Electronic_ISSN :
1944-9445
DOI :
10.1109/ROMAN.2012.6343863