Regional Information Center for Science and Technology (RICeST) - Abstraction in Model Based Partially Observable Reinforcement Learning Using Extended Sequence Trees

DocumentCode :

2110606

Title :

Abstraction in Model Based Partially Observable Reinforcement Learning Using Extended Sequence Trees

Author :

Cilden, Erkin ; Polat, Faruk

Author_Institution :

Dept. of Comput. Eng., Middle East Tech. Univ., Ankara, Turkey

Volume :

fYear :

2012

fDate :

4-7 Dec. 2012

Firstpage :

348

Lastpage :

355

Abstract :

The extended sequence tree is a direct method for automatically generating useful abstractions in reinforcement learning, designed for problems that can be modelled as Markov decision processes. This paper proposes a way to extend the extended sequence tree method to partial observability, formalized as a partially observable Markov decision process under the belief state formalism. The extension requires a reasonable approximation of the information state. Inspired by statistical ranking, a simple but effective discretization schema over the belief state space is defined. The extended sequence tree method is modified to use this schema under partial observability, and the effectiveness of the resulting algorithm is shown by experiments on benchmark problems.
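As an illustration of the belief state formalism the abstract refers to, the following is a minimal sketch only. The belief update is the standard Bayesian one for a partially observable Markov decision process. The `rank_key` function is a hypothetical ranking-style discretization written for this note; the abstract does not specify the authors' schema, so it should not be read as their method. The array layouts `T[s][a][s2]` and `O[a][s2][o]` and the parameter `top_k` are likewise assumptions.

```python
from typing import List, Sequence, Tuple


def belief_update(
    belief: Sequence[float],
    action: int,
    observation: int,
    T: Sequence[Sequence[Sequence[float]]],
    O: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Standard POMDP belief update.

    b'(s2) is proportional to O[a][s2][o] * sum_s T[s][a][s2] * b(s).
    Assumed layout: T[s][a][s2] is the transition probability,
    O[a][s2][o] is the observation probability.
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Probability of reaching s2 before the observation is taken into account.
        predicted = sum(T[s][action][s2] * belief[s] for s in range(n))
        unnormalized.append(O[action][s2][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in unnormalized]


def rank_key(belief: Sequence[float], top_k: int = 2) -> Tuple[int, ...]:
    """Hypothetical discretization (an assumption, not the paper's schema).

    Maps a continuous belief vector to the indices of its top_k most
    probable states, ordered by decreasing probability. Ties are broken
    by the lower state index.
    """
    order = sorted(range(len(belief)), key=lambda s: (-belief[s], s))
    return tuple(order[:top_k])
```

Under this sketch, two beliefs that rank the same states highest share one discrete key, so a tabular method could index on `rank_key(belief)` instead of on the continuous belief vector.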

Keywords :

Markov processes; learning (artificial intelligence); observability; sequences; tree data structures; automatic generation; belief state space; benchmark problems; discretization schema; extended sequence tree method; model-based partially observable reinforcement learning; partially observable Markov decision process; reasonable information state approximation; statistical ranking; extended sequence tree; learning abstractions; reinforcement learning

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on

Conference_Location :

Macau

Print_ISBN :

978-1-4673-6057-9

Type :

conf

DOI :

10.1109/WI-IAT.2012.161

Filename :

6511592

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2110606