Title :
A lemma on the multiarmed bandit problem
Author :
Tsitsiklis, John N.
Author_Institution :
Massachusetts Institute of Technology, Cambridge, MA, USA
fDate :
6/1/1986
Abstract :
We prove a lemma on the optimal value function for the multiarmed bandit problem; the lemma provides a simple, direct proof of the optimality of writeoff policies. This, in turn, leads to a new proof of the optimality of the index rule.
Keywords :
Optimal control; Approximation algorithms; Equations; Infinite horizon; Probability distribution; Retirement
Journal_Title :
Automatic Control, IEEE Transactions on
DOI :
10.1109/TAC.1986.1104332