DocumentCode
852905
Title
A lemma on the multiarmed bandit problem
Author
Tsitsiklis, John N.
Author_Institution
Massachusetts Institute of Technology, Cambridge, MA, USA
Volume
31
Issue
6
fYear
1986
fDate
6/1/1986
Firstpage
576
Lastpage
577
Abstract
We prove a lemma on the optimal value function for the multiarmed bandit problem which provides a simple direct proof of optimality of writeoff policies. This, in turn, leads to a new proof of optimality of the index rule.
Keywords
Optimal control; Approximation algorithms; Equations; Infinite horizon; Probability distribution; Retirement
fLanguage
English
Journal_Title
Automatic Control, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9286
Type
jour
DOI
10.1109/TAC.1986.1104332
Filename
1104332