DocumentCode
2005741
Title
A Bayesian Learning Automaton for Solving Two-Armed Bernoulli Bandit Problems
Author
Granmo, Ole Christoffer
Author_Institution
Dept. of ICT, Univ. of Agder, Grimstad, Norway
fYear
2008
fDate
11-13 Dec. 2008
Firstpage
23
Lastpage
30
Abstract
The two-armed Bernoulli bandit (TABB) problem is a classical optimization problem where an agent sequentially pulls one of two arms attached to a gambling machine, with each pull resulting either in a reward or a penalty. The reward probabilities of each arm are unknown, and thus one must balance exploiting existing knowledge about the arms against obtaining new information. In recent decades, several computationally efficient algorithms for tackling this problem have emerged, with learning automata (LA) being known for their ε-optimality, and confidence interval based algorithms for logarithmically growing regret. Applications include treatment selection in clinical trials, route selection in adaptive routing, and plan exploration in games like Go. The TABB has also been extensively studied from a Bayesian perspective; in general, however, such analysis leads to computationally inefficient solution policies. This paper introduces the Bayesian learning automaton (BLA). The BLA is inherently Bayesian in nature, yet relies simply on counting rewards/penalties and on random sampling from a pair of twin beta distributions. Furthermore, we report that BLA is self-correcting and converges to only pulling the optimal arm with probability 1. Extensive experiments demonstrate that, in contrast to most LA, BLA does not rely on external learning speed/accuracy control. It also outperforms recently proposed confidence interval based algorithms. We thus believe that BLA opens the way to improved performance in a number of applications, and that it forms the basis for a new avenue of research.
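The abstract describes the BLA only in outline: it counts rewards and penalties per arm and samples from a pair of beta distributions. The Python sketch below is one plausible reading of that outline, not the paper's published algorithm. It assumes uniform Beta(1, 1) priors and assumes the arm with the larger sampled value is pulled; the function name `bla_sketch` and its parameters are hypothetical.

```python
import random


def bla_sketch(p_arms, n_rounds, seed=0):
    """Illustrative two-armed Bernoulli bandit loop (assumed details, see lead-in).

    p_arms   -- true reward probabilities of the two arms (unknown to the learner)
    n_rounds -- number of pulls to simulate
    Returns (pulls per arm, beta parameters per arm).
    """
    rng = random.Random(seed)
    # counts[i] = [rewards + 1, penalties + 1]; the "+ 1" encodes a Beta(1, 1)
    # prior, which is an assumption and is not stated in the abstract.
    counts = [[1, 1], [1, 1]]
    pulls = [0, 0]
    for _ in range(n_rounds):
        # Draw one value from each arm's beta distribution.
        samples = [rng.betavariate(a, b) for a, b in counts]
        # Assumed selection rule: pull the arm with the larger sample.
        arm = 0 if samples[0] > samples[1] else 1
        # Simulate the Bernoulli outcome (reward or penalty) of the pull.
        rewarded = rng.random() < p_arms[arm]
        # Update only the counter that matches the observed outcome.
        counts[arm][0 if rewarded else 1] += 1
        pulls[arm] += 1
    return pulls, counts
```

Calling `bla_sketch((0.9, 0.1), 2000)` simulates 2000 pulls on arms with reward probabilities 0.9 and 0.1. Under these assumptions the loop has no learning-rate parameter to tune, which matches the abstract's remark that BLA does not rely on external learning speed/accuracy control.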
Keywords
belief networks; learning automata; optimisation; Bayesian learning automaton; optimization problem; twin beta distributions; two-armed Bernoulli bandit problems; Application software; Arm; Artificial intelligence; Bayesian methods; Clinical trials; Learning automata; Machine learning; Resource management; Routing; Sampling methods;
fLanguage
English
Publisher
ieee
Conference_Titel
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location
San Diego, CA
Print_ISBN
978-0-7695-3495-4
Type
conf
DOI
10.1109/ICMLA.2008.67
Filename
4724951