• DocumentCode
    3613838
  • Title

    On the mean-square rate of convergence of temporal-difference learning algorithms

  • Author

    V.B. Tadic

  • Author_Institution
    Dept. of Electr. & Electron. Eng., Melbourne Univ., Parkville, Vic., Australia
  • Volume
    2
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    1454
  • Abstract
    In this paper, the mean-square rate of convergence of temporal-difference learning algorithms is analyzed. The analysis is carried out for the case of a discounted cost function associated with a Markov chain with a finite-dimensional state space. Under mild conditions, it is shown that these algorithms converge at the rate O(n^{-1/2}). The results are illustrated with examples related to random coefficient autoregression models and M/G/1 queues.
  • Keywords
    "Convergence","Cost function","Function approximation","Algorithm design and analysis","Automatic control","Approximation error","Stochastic processes","Predictive models","Performance analysis","Australia Council"
  • Publisher
    IEEE
  • Conference_Titel
    American Control Conference, 2002. Proceedings of the 2002
  • ISSN
    0743-1619
  • Print_ISBN
    0-7803-7298-0
  • Type

    conf

  • DOI
    10.1109/ACC.2002.1023226
  • Filename
    1023226