• DocumentCode
    3740419
  • Title
    Improving Multi-agent Learners Using Less-Biased Value Estimators
  • Author
    Sherief Abdallah;Michael Kaisers
  • Author_Institution
    Fac. of Eng. &
  • Volume
    2
  • fYear
    2015
  • Firstpage
    120
  • Lastpage
    124
  • Abstract
    Many different value-based or policy-search reinforcement learning algorithms have been applied to multi-agent settings. Value-based learners estimate the expected return (value) for each state-action combination and then derive a policy from these expectations. Policy-search learners optimize the agent's policy directly by using a parameterized representation of the policy and then optimizing the parameter values to maximize the expected return. While the two classes of algorithms have been considered as contrasting one another, we note that several policy-search algorithms (e.g., Weighted Policy Learner and Infinitesimal Gradient Ascent) need a method for estimating the expected returns. In practice, these policy-search algorithms internally use an update equation for incrementally improving value estimates. In this paper we present the first detailed study of the effect of using different value-based learning algorithms as components of policy-search learners. Our results show that the particular choice can significantly affect performance.
  • Keywords
    "Games","Prediction algorithms","Algorithm design and analysis","Approximation algorithms","Learning (artificial intelligence)","Mathematical model","Electronic mail"
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 IEEE / WIC / ACM International Conference on
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2015.113
  • Filename
    7397346