• DocumentCode
    714186
  • Title

    An estimation based allocation rule with super-linear regret and finite lock-on time for time-dependent multi-armed bandit processes

  • Author

    Prokopiou, Prokopis C. ; Caines, Peter E. ; Mahajan, Aditya

  • Author_Institution
    Integrated Program in Neurosci., McGill Univ., Montreal, QC, Canada
  • fYear
    2015
  • fDate
    3-6 May 2015
  • Firstpage
    1299
  • Lastpage
    1306
  • Abstract
    The multi-armed bandit (MAB) problem has been an active area of research since the early 1930s. The majority of the literature restricts attention to i.i.d. or Markov reward processes. In this paper, the finite-parameter MAB problem with time-dependent reward processes is investigated. An upper confidence bound (UCB) based index policy, where the index is computed based on the maximum-likelihood estimate of the unknown parameter, is proposed. This policy locks on to the optimal arm in finite expected time but has a super-linear regret. As an example, the proposed index policy is used for minimizing prediction error when each arm is a auto-regressive moving average (ARMA) process.
  • Keywords
    Markov processes; autoregressive moving average processes; maximum likelihood estimation; resource allocation; ARMA process; Markov reward process; UCB based index policy; auto-regressive moving average process; estimation based allocation rule; finite expected time; finite lock-on time; finite-parameter MAB problem; maximum-likelihood estimation; prediction error minimizaion; superlinear regret; time-dependent multiarmed bandit process; time-dependent reward process; upper confidence bound based index policy; Computers; Indexes; Markov processes; Maximum likelihood estimation; Resource management; Technological innovation;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Electrical and Computer Engineering (CCECE), 2015 IEEE 28th Canadian Conference on
  • Conference_Location
    Halifax, NS
  • ISSN
    0840-7789
  • Print_ISBN
    978-1-4799-5827-6
  • Type

    conf

  • DOI
    10.1109/CCECE.2015.7129466
  • Filename
    7129466