DocumentCode
77972
Title
Bandits With Heavy Tail
Author
Bubeck, Sebastian ; Cesa-Bianchi, Nicolo ; Lugosi, Gabor
Author_Institution
Dept. of Oper. Res. & Financial Eng., Princeton Univ., Princeton, NJ, USA
Volume
59
Issue
11
fYear
2013
fDate
Nov. 2013
Firstpage
7711
Lastpage
7717
Abstract
The stochastic multiarmed bandit problem is well understood when the reward distributions are sub-Gaussian. In this paper, we examine the bandit problem under the weaker assumption that the distributions have moments of order 1 + ε, for some ε ∈ (0,1]. Surprisingly, moments of order 2 (i.e., finite variance) are sufficient to obtain regret bounds of the same order as under sub-Gaussian reward distributions. In order to achieve such regret, we define sampling strategies based on refined estimators of the mean such as the truncated empirical mean, Catoni's M-estimator, and the median-of-means estimator. We also derive matching lower bounds that also show that the best achievable regret deteriorates when ε < 1.
Keywords
Gaussian distribution; sampling methods; stochastic processes; heavy-tailed distributions; sampling strategies; stochastic multiarmed bandit problem; subGaussian reward distributions; weaker assumption; Electronic mail; Equations; Indexes; Probability distribution; Random variables; Robustness; Standards; Heavy-tailed distributions; regret bounds; robust estimators; stochastic multi-armed bandit;
fLanguage
English
Journal_Title
Information Theory, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9448
Type
jour
DOI
10.1109/TIT.2013.2277869
Filename
6576820