DocumentCode
2850317
Title
Learning and sharing in a changing world: Non-Bayesian restless bandit with multiple players
Author
Liu, Haoyang ; Liu, Keqin ; Zhao, Qing
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of California, Davis, CA, USA
fYear
2011
fDate
6-11 Feb. 2011
Firstpage
1
Lastpage
7
Abstract
We consider decentralized restless multi-armed bandit problems with unknown dynamics and multiple players. The reward state of each arm transitions according to an unknown Markovian rule when the arm is played and evolves according to an arbitrary unknown random process when it is passive. Players activating the same arm at the same time collide and suffer a reward loss. The objective is to maximize the long-term reward by designing a decentralized arm selection policy that handles both the unknown reward models and collisions among players. A decentralized policy is constructed that achieves regret of logarithmic order. The result finds applications in communication networks, financial investment, and industrial engineering.
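The following is a minimal simulation sketch of the problem setup described in the abstract: each arm's reward state follows an unknown Markov rule when played and an arbitrary random process when passive, and players choosing the same arm collide and earn nothing. The arm-selection rule here is a deliberately naive placeholder (independent random choices), not the logarithmic-regret policy constructed in the paper; all names, parameters, and the 2-state arm model are illustrative assumptions.

import random

NUM_ARMS = 5
NUM_PLAYERS = 2
HORIZON = 1000

# Transition probabilities of each arm's 2-state Markov chain when active
# (unknown to the players): P(stay in state 1) and P(move from 0 to 1).
p_stay = [random.uniform(0.5, 0.9) for _ in range(NUM_ARMS)]
p_up = [random.uniform(0.1, 0.5) for _ in range(NUM_ARMS)]
state = [random.randint(0, 1) for _ in range(NUM_ARMS)]

def step_active(arm):
    """Advance an activated arm according to its Markovian rule."""
    if state[arm] == 1:
        state[arm] = 1 if random.random() < p_stay[arm] else 0
    else:
        state[arm] = 1 if random.random() < p_up[arm] else 0

def step_passive(arm):
    """Passive arms evolve by an arbitrary unknown process (here: a rare random flip)."""
    if random.random() < 0.05:
        state[arm] ^= 1

total_reward = [0.0] * NUM_PLAYERS
for t in range(HORIZON):
    # Each player picks an arm independently (decentralized, no communication).
    choices = [random.randrange(NUM_ARMS) for _ in range(NUM_PLAYERS)]
    for arm in range(NUM_ARMS):
        players_on_arm = [p for p, c in enumerate(choices) if c == arm]
        if players_on_arm:
            if len(players_on_arm) == 1:
                # A sole player collects the arm's current reward state (0 or 1);
                # two or more players on the same arm collide and earn nothing.
                total_reward[players_on_arm[0]] += state[arm]
            step_active(arm)
        else:
            step_passive(arm)

print("Per-player reward over the horizon:", total_reward)

Replacing the random choices with a learning-based, orthogonalizing selection rule is exactly the design problem the paper addresses; this sketch only fixes the environment against which such a policy's regret would be measured.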
Keywords
Markov processes; game theory; Markovian rule; communication networks; decentralized arm selection policy; financial investment; industrial engineering; multi-armed bandit problems; multiple players; non-Bayesian restless bandit; reward models; reward state; History; Indexes; Loss measurement; Random processes; Synchronization; Upper bound
fLanguage
English
Publisher
ieee
Conference_Titel
Information Theory and Applications Workshop (ITA), 2011
Conference_Location
La Jolla, CA
Print_ISBN
978-1-4577-0360-7
Type
conf
DOI
10.1109/ITA.2011.5743588
Filename
5743588