• DocumentCode
    77972
  • Title

    Bandits With Heavy Tail

  • Author

    Bubeck, Sébastien ; Cesa-Bianchi, Nicolò ; Lugosi, Gábor

  • Author_Institution
    Dept. of Oper. Res. & Financial Eng., Princeton Univ., Princeton, NJ, USA
  • Volume
    59
  • Issue
    11
  • fYear
    2013
  • fDate
    Nov. 2013
  • Firstpage
    7711
  • Lastpage
    7717
  • Abstract
    The stochastic multiarmed bandit problem is well understood when the reward distributions are sub-Gaussian. In this paper, we examine the bandit problem under the weaker assumption that the distributions have moments of order 1 + ε, for some ε ∈ (0,1]. Surprisingly, moments of order 2 (i.e., finite variance) are sufficient to obtain regret bounds of the same order as under sub-Gaussian reward distributions. In order to achieve such regret, we define sampling strategies based on refined estimators of the mean such as the truncated empirical mean, Catoni's M-estimator, and the median-of-means estimator. We also derive matching lower bounds, which show that the best achievable regret deteriorates when ε < 1.
  • Keywords
    Gaussian distribution; sampling methods; stochastic processes; heavy-tailed distributions; sampling strategies; stochastic multiarmed bandit problem; sub-Gaussian reward distributions; weaker assumption; Electronic mail; Equations; Indexes; Probability distribution; Random variables; Robustness; Standards; regret bounds; robust estimators; stochastic multi-armed bandit
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2013.2277869
  • Filename
    6576820
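
The abstract names the median-of-means estimator as one of the refined mean estimators. As a rough illustration only, the sketch below shows the generic textbook form of that estimator: split the samples into blocks, average each block, and take the median of the block averages. The function name `median_of_means` and the parameter `num_blocks` are hypothetical choices made for this example. The block count is left as a free parameter here and is not the paper's tuning, and the code is not the paper's sampling strategy.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Generic median-of-means estimate of the mean of `samples`.

    Assumption for this sketch: `num_blocks` is chosen by the caller.
    Samples left over after forming equal-size blocks are discarded.
    """
    if num_blocks < 1 or num_blocks > len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")

    block_size = len(samples) // num_blocks

    # Average each consecutive block of `block_size` samples.
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]

    # The median of the block averages is insensitive to a few
    # blocks that were contaminated by extreme values.
    return statistics.median(block_means)


if __name__ == "__main__":
    rewards = [1, 1, 1, 1, 1, 1000]  # one extreme observation
    print(statistics.fmean(rewards))    # plain empirical mean: 167.5
    print(median_of_means(rewards, 3))  # block means 1, 1, 500.5, so 1.0
```

With a single extreme observation, the plain empirical mean is pulled far from the typical value, while the median of the three block means is not. This is the kind of robustness that matters when rewards are heavy-tailed.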