DocumentCode :
852905
Title :
A lemma on the multiarmed bandit problem
Author :
Tsitsiklis, John N.
Author_Institution :
Massachusetts Institute of Technology, Cambridge, MA, USA
Volume :
31
Issue :
6
fYear :
1986
fDate :
6/1/1986
Firstpage :
576
Lastpage :
577
Abstract :
We prove a lemma on the optimal value function for the multiarmed bandit problem which provides a simple direct proof of optimality of writeoff policies. This, in turn, leads to a new proof of optimality of the index rule.
Keywords :
Optimal control; Approximation algorithms; Equations; Infinite horizon; Probability distribution; Retirement;
fLanguage :
English
Journal_Title :
Automatic Control, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9286
Type :
jour
DOI :
10.1109/TAC.1986.1104332
Filename :
1104332