• DocumentCode
    2850317
  • Title
    Learning and sharing in a changing world: Non-Bayesian restless bandit with multiple players
  • Author
    Liu, Haoyang; Liu, Keqin; Zhao, Qing
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of California, Davis, CA, USA
  • fYear
    2011
  • fDate
    6-11 Feb. 2011
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    We consider decentralized restless multi-armed bandit problems with unknown dynamics and multiple players. The reward state of each arm transitions according to an unknown Markovian rule when the arm is played and evolves according to an arbitrary unknown random process when it is passive. Players activating the same arm at the same time collide and suffer a reward loss. The objective is to maximize the long-term reward by designing a decentralized arm selection policy that handles both the unknown reward models and collisions among players. A decentralized policy is constructed that achieves regret of logarithmic order (an illustrative sketch of the setting follows this record). The result finds applications in communication networks, financial investment, and industrial engineering.
  • Keywords
    Markov processes; game theory; Markovian rule; communication networks; decentralized arm selection policy; financial investment; industrial engineering; multi-armed bandit problems; multiple players; non-Bayesian restless bandit; reward models; reward state; History; Indexes; Loss measurement; Markov processes; Random processes; Synchronization; Upper bound
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Theory and Applications Workshop (ITA), 2011
  • Conference_Location
    La Jolla, CA
  • Print_ISBN
    978-1-4577-0360-7
  • Type
    conf
  • DOI
    10.1109/ITA.2011.5743588
  • Filename
    5743588
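
The abstract describes the decentralized restless-bandit setting only at a high level. The following is a minimal, hypothetical Python sketch of that setting, assuming two-state Markovian arms, a generic UCB-style index computed by each player from its own observations, and a rotating rank offset so that players target different arms and avoid persistent collisions. The arm model, the index, and the offset rule are illustrative assumptions for this sketch; they are not the policy constructed in the paper.

```python
import numpy as np

# Hypothetical sketch of the decentralized restless-bandit setting:
# M players, N arms with unknown two-state Markovian reward dynamics.
# Each player keeps a UCB-style index from its own observations and plays
# the rank given by a rotating offset, so players spread over the top-M arms.
# Colliding players (same arm, same slot) receive no reward.

rng = np.random.default_rng(0)

N, M, T = 5, 2, 20000                      # arms, players, horizon
P = rng.uniform(0.2, 0.8, size=(N, 2, 2))  # per-arm transition probabilities (unknown to players)
P /= P.sum(axis=2, keepdims=True)
state = rng.integers(0, 2, size=N)         # reward state of each arm (reward = state)

counts = np.zeros((M, N))                  # number of plays of each arm by each player
means = np.zeros((M, N))                   # sample-mean reward estimates per player

total_reward = 0.0
for t in range(1, T + 1):
    choices = np.full(M, -1)
    for m in range(M):
        # UCB-style index built from this player's own observations only
        ucb = means[m] + np.sqrt(2.0 * np.log(t) / np.maximum(counts[m], 1))
        ucb[counts[m] == 0] = np.inf       # force initial exploration of unplayed arms
        ranked = np.argsort(-ucb)
        # Rotating offset: player m targets the ((m + t) mod M)-th best arm,
        # so the players cycle over the top-M arms instead of colliding.
        choices[m] = ranked[(m + t) % M]

    # Collision model: players activating the same arm receive zero reward
    for m in range(M):
        arm = choices[m]
        collided = np.sum(choices == arm) > 1
        reward = 0.0 if collided else float(state[arm])
        counts[m, arm] += 1
        means[m, arm] += (reward - means[m, arm]) / counts[m, arm]
        total_reward += reward

    # Restless arms: every arm evolves each slot (played or passive);
    # for simplicity this sketch uses the same dynamics in both cases.
    for i in range(N):
        state[i] = rng.choice(2, p=P[i, state[i]])

print(f"average per-slot reward over {T} slots: {total_reward / T:.3f}")
```

Running the script prints the average per-slot reward collected by the two players; comparing it against the best two arms' stationary rewards gives a rough empirical sense of the regret incurred in this toy instance.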