DocumentCode :
321199
Title :
Minimax lower bounds for the two-armed bandit problem
Author :
Kulkarni, Sanjeev R.; Lugosi, Gabor
Author_Institution :
Dept. of Electr. Eng., Princeton Univ., NJ, USA
Volume :
3
fYear :
1997
fDate :
10-12 Dec 1997
Firstpage :
2293
Abstract :
We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins (1985). Also, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every n, there is a configuration such that the regret at time n is at least (1-ε) times the regret of random guessing, where ε is an arbitrarily small positive constant.
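For context, a minimal sketch of the quantities the abstract refers to, assuming a two-armed bandit with unknown mean rewards \mu_1, \mu_2 and an allocation rule that pulls arm I_t at time t (this notation is illustrative and not taken from the paper):

    R_n = n \max(\mu_1, \mu_2) - \mathbb{E}\Big[ \sum_{t=1}^{n} \mu_{I_t} \Big]    (expected regret after n pulls)
    R_n^{rand} = \frac{n}{2} \, |\mu_1 - \mu_2|                                    (regret of uniform random guessing)

Under this reading, the abstract's claim is that for every allocation rule and every n there is an allowable configuration with R_n \ge (1 - \varepsilon) \, R_n^{rand}.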
Keywords :
minimax techniques; random processes; asymptotic lower bound; finite-sample minimax version; minimax lower bounds; minimax regret; random guessing; two-armed bandit problem; Arm; Density measurement
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Proceedings of the 36th IEEE Conference on Decision and Control, 1997
Conference_Location :
San Diego, CA
ISSN :
0191-2216
Print_ISBN :
0-7803-4187-2
Type :
conf
DOI :
10.1109/CDC.1997.657117
Filename :
657117