• DocumentCode
    852905
  • Title
    A lemma on the multiarmed bandit problem
  • Author
    Tsitsiklis, John N.
  • Author_Institution
    Massachusetts Institute of Technology, Cambridge, MA, USA
  • Volume
    31
  • Issue
    6
  • fYear
    1986
  • fDate
    6/1/1986
  • Firstpage
    576
  • Lastpage
    577
  • Abstract
    We prove a lemma on the optimal value function for the multiarmed bandit problem which provides a simple direct proof of optimality of writeoff policies. This, in turn, leads to a new proof of optimality of the index rule.
  • Keywords
    Optimal control; Approximation algorithms; Equations; Infinite horizon; Probability distribution; Retirement
  • fLanguage
    English
  • Journal_Title
    Automatic Control, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9286
  • Type
    jour
  • DOI
    10.1109/TAC.1986.1104332
  • Filename
    1104332