Title :
Budget-Limited Multi-armed Bandit Problem with Dynamic Rewards and Proposed Algorithms
Author :
Makoto Niimi;Takayuki Ito
Author_Institution :
Dept. of Comput. Sci., Nagoya Inst. of Technol., Nagoya, Japan
Date :
7/1/2015 12:00:00 AM
Abstract :
We focus on budget-limited multi-armed bandit (BL-MAB) problems, in which an agent's actions are costly and constrained by a fixed budget. Existing BL-MAB formulations assume that the reward distributions are static. In this paper, we instead assume that rewards are dynamic, which is more realistic: for example, the effect of online advertising, a real-world application of BL-MAB problems, varies with trends and dates. For such dynamic settings we propose two new bandit algorithms, D-KUBE and SW-KUBE. D-KUBE discounts past observations by a rate gamma; SW-KUBE restricts estimation to a window (scope) tau of recent observations. In our experiments, we compared the existing bandit algorithm KUBE with the proposed algorithms. The main contributions of this paper are as follows: (1) when the reward distributions are static, KUBE, D-KUBE, and SW-KUBE perform almost identically; (2) when the reward distributions are dynamic, the proposed D-KUBE and SW-KUBE outperform KUBE; (3) total rewards under abrupt changes are higher than those under gradual changes.
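The two reward-estimation ideas named in the abstract (a discount rate gamma for D-KUBE and a scope tau for SW-KUBE) can be sketched as below. This is an illustrative reconstruction based only on the abstract's description, not the authors' implementation; the function names and the incremental bookkeeping are assumptions.

```python
# Hypothetical sketch of the two dynamic-reward estimators described in the
# abstract (not the authors' code): a discounted mean (D-KUBE idea) and a
# sliding-window mean (SW-KUBE idea) for one arm's observed rewards.
from collections import deque

def discounted_mean(rewards, gamma):
    """Discounted empirical mean: each step, older rewards are
    down-weighted by a factor gamma (0 < gamma <= 1)."""
    num = 0.0  # discounted sum of rewards
    den = 0.0  # discounted count of observations
    for r in rewards:  # rewards listed oldest first
        num = gamma * num + r
        den = gamma * den + 1.0
    return num / den if den > 0 else 0.0

def sliding_window_mean(rewards, tau):
    """Plain mean over only the last tau observations (the 'scope')."""
    window = deque(rewards, maxlen=tau)
    return sum(window) / len(window) if window else 0.0
```

With gamma = 1 the discounted mean reduces to the ordinary sample mean, while smaller gamma (or smaller tau) makes the estimate track abrupt changes in the reward distribution faster, at the cost of higher variance.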
Keywords :
"Heuristic algorithms","Electronic mail","Advertising","Market research","Approximation algorithms","Approximation methods","Informatics"
Conference_Titel :
2015 IIAI 4th International Congress on Advanced Applied Informatics (IIAI-AAI)
Print_ISBN :
978-1-4799-9957-6
DOI :
10.1109/IIAI-AAI.2015.248