Title :
Gain-based Exploration: From Multi-armed Bandits to Partially Observable Environments
Author :
Si, Bailu ; Herrmann, Michael J. ; Pawelzik, Klaus
Author_Institution :
Univ. of Bremen, Bremen
Abstract :
We introduce gain-based policies for exploration in active learning problems. For exploration in multi-armed bandits with knowledge of the reward variances, an ideal gain-maximization exploration policy is described in a unified framework which also covers error-based and counter-based exploration. For realistic situations without prior knowledge of the reward variances, we establish an upper bound on the gain function, yielding a practical gain-maximization exploration policy that asymptotically achieves optimal exploration. Finally, we extend the gain-maximization exploration scheme to partially observable environments. Approximating the environment by a set of local bandits, the agent actively selects its actions by maximizing the discounted gain from learning the local bandits. The resulting gain-based exploration not only outperforms random exploration, but also produces the curiosity-driven behavior observed in natural agents.
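The abstract does not spell out the paper's exact gain function, so the following is only a minimal sketch of the general idea: treat the expected reduction in the estimation error of an arm's mean reward as the "gain" of pulling that arm, and always pull the arm with the largest gain. The class name `GainBandit` and the variance-reduction gain `s²/(n(n+1))` are illustrative assumptions, not the authors' definitions.

```python
import random
import statistics


class GainBandit:
    """Illustrative gain-based exploration for a multi-armed bandit.

    Gain of pulling arm i is taken (as an assumption, not the paper's
    formula) to be the drop in the variance of the mean estimate:
        s_i^2 / n_i  -  s_i^2 / (n_i + 1)  =  s_i^2 / (n_i (n_i + 1)),
    where s_i^2 is the empirical reward variance and n_i the pull count.
    """

    def __init__(self, n_arms):
        self.rewards = [[] for _ in range(n_arms)]

    def gain(self, i):
        n = len(self.rewards[i])
        if n < 2:
            # Variance is undefined; force an initial pull of every arm.
            return float("inf")
        s2 = statistics.variance(self.rewards[i])
        return s2 / (n * (n + 1))

    def select(self):
        # Pure exploration step: pick the arm with maximal learning gain.
        gains = [self.gain(i) for i in range(len(self.rewards))]
        return gains.index(max(gains))

    def update(self, arm, reward):
        self.rewards[arm].append(reward)


if __name__ == "__main__":
    random.seed(0)
    bandit = GainBandit(3)
    means, sigmas = [0.1, 0.5, 0.9], [0.05, 1.0, 0.05]
    for _ in range(200):
        a = bandit.select()
        bandit.update(a, random.gauss(means[a], sigmas[a]))
    print([len(r) for r in bandit.rewards])
```

High-variance arms keep a larger gain for longer, so the policy concentrates its pulls where the reward estimate is still uncertain, which is the qualitative behavior the abstract describes.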
Keywords :
decision making; knowledge acquisition; learning (artificial intelligence); counter-based exploration; error-based exploration; gain-maximization exploration policy; multi-armed bandits; asymptotically optimal exploration; partially observable environments; Decision making; Entropy; Estimation error; Gain measurement; Learning; Redundancy; Robots; Testing; Upper bound
Conference_Titel :
Third International Conference on Natural Computation (ICNC 2007)
Conference_Location :
Haikou, China
Print_ISBN :
978-0-7695-2875-5
DOI :
10.1109/ICNC.2007.395