Title :
Budget-Limited Multi-armed Bandit Problem with Dynamic Rewards and Proposed Algorithms
Author :
Makoto Niimi;Takayuki Ito
Author_Institution :
Dept. of Comput. Sci., Nagoya Inst. of Technol., Nagoya, Japan
Date :
7/1/2015 12:00:00 AM
Abstract :
We focus on budget-limited multi-armed bandit (BL-MAB) problems, in which an agent's actions are costly and constrained by a fixed budget. Existing BL-MAB formulations assume that the reward distributions are static. In this paper, we instead assume that rewards are dynamic, which is more realistic: for example, the effect of online advertising, a real-world application of BL-MAB problems, varies with trends and dates. For such dynamic settings we propose two new bandit algorithms, D-KUBE and SW-KUBE. D-KUBE discounts past observations by a rate gamma; SW-KUBE restricts estimation to a window (scope) tau of recent observations. In our experiments, we compared the existing bandit algorithm KUBE with the proposed algorithms. The main contributions of this paper are as follows: (1) when the reward distributions are static, KUBE, D-KUBE, and SW-KUBE perform almost identically; (2) when the reward distributions are dynamic, the proposed D-KUBE and SW-KUBE outperform KUBE; (3) total rewards under abrupt changes are higher than those under gradual changes.
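The two reward-estimation ideas named in the abstract (a discount rate gamma for D-KUBE and a scope tau for SW-KUBE) can be sketched as below. This is an illustrative reconstruction based only on the abstract's description, not the authors' implementation; the function names and the incremental bookkeeping are assumptions.

```python
# Hypothetical sketch of the two dynamic-reward estimators described in the
# abstract (not the authors' code): a discounted mean (D-KUBE idea) and a
# sliding-window mean (SW-KUBE idea) for one arm's observed rewards.
from collections import deque

def discounted_mean(rewards, gamma):
    """Discounted empirical mean: each step, older rewards are
    down-weighted by a factor gamma (0 < gamma <= 1)."""
    num = 0.0  # discounted sum of rewards
    den = 0.0  # discounted count of observations
    for r in rewards:  # rewards listed oldest first
        num = gamma * num + r
        den = gamma * den + 1.0
    return num / den if den > 0 else 0.0

def sliding_window_mean(rewards, tau):
    """Plain mean over only the last tau observations (the 'scope')."""
    window = deque(rewards, maxlen=tau)
    return sum(window) / len(window) if window else 0.0
```

With gamma = 1 the discounted mean reduces to the ordinary sample mean, while smaller gamma (or smaller tau) makes the estimate track abrupt changes in the reward distribution faster, at the cost of higher variance.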
Keywords :
"Heuristic algorithms","Electronic mail","Advertising","Market research","Approximation algorithms","Approximation methods","Informatics"
Conference_Titel :
2015 IIAI 4th International Congress on Advanced Applied Informatics (IIAI-AAI)
Print_ISBN :
978-1-4799-9957-6
DOI :
10.1109/IIAI-AAI.2015.248