• DocumentCode
    1535710
  • Title

    Optimally Sensing a Single Channel Without Prior Information: The Tiling Algorithm and Regret Bounds

  • Author

    Filippi, Sarah ; Cappé, Olivier ; Garivier, Aurélien

  • Author_Institution
    LTCI, TELECOM ParisTech, Paris, France
  • Volume
    5
  • Issue
    1
  • fYear
    2011
  • Firstpage
    68
  • Lastpage
    76
  • Abstract
    We consider the task of optimally sensing a two-state Markovian channel with an observation cost and without any prior information regarding the channel's transition probabilities. This task is of interest in the field of cognitive radio as a model for opportunistic access to a communication network by a secondary user. The optimal sensing problem may be cast into the framework of model-based reinforcement learning in a specific class of partially observable Markov decision processes (POMDPs). We propose the Tiling Algorithm, an original method aimed at reaching an optimal tradeoff between the exploration (or estimation) and exploitation requirements. It is shown that this algorithm achieves finite horizon regret bounds that are as good as those recently obtained for multi-armed bandits and finite-state Markov decision processes (MDPs).
  • Keywords
    Markov processes; cognitive radio; learning (artificial intelligence); wireless channels; Markov decision processes; channel sensing; channel transition probability; finite state Markov decision process; model-based reinforcement learning; multiarmed bandit; tiling algorithm; two-state Markovian channel; opportunistic channel access; partially observable Markov decision processes (POMDPs); regret bounds; reinforcement learning; restless bandit
  • fLanguage
    English
  • Journal_Title
    Selected Topics in Signal Processing, IEEE Journal of
  • Publisher
    IEEE
  • ISSN
    1932-4553
  • Type

    jour

  • DOI
    10.1109/JSTSP.2010.2058091
  • Filename
    5510097