Title :
Multi-agent Q-learning and regression trees for automated pricing decisions
Author :
Sridharan, Manu ; Tesauro, Gerald
Author_Institution :
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
Abstract :
We study the use of the reinforcement learning algorithm Q-learning, with regression-tree function approximation, to learn pricing strategies in a competitive marketplace of economic software agents. Q-learning learns to estimate the long-term expected reward of each state-action pair. In a stationary environment, with a lookup table representing the Q-function, the learning procedure is guaranteed to converge to an optimal policy. Using Q-learning in multi-agent systems, however, presents special challenges: the simultaneous adaptation of multiple agents creates a non-stationary environment for each agent, so there are no theoretical guarantees of convergence or optimality. Moreover, large multi-agent systems may have state spaces too large to represent with lookup tables, necessitating the use of function approximation.
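The update rule behind the abstract is the standard one-step Q-learning rule, Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]. The sketch below is not the paper's market model; it is a minimal illustration of the non-stationarity issue the abstract describes, with two tabular Q-learning pricing agents adapting simultaneously. The price grid, toy demand model, and hyperparameters are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (illustrative only, not the paper's model): two pricing
# agents adapt simultaneously with tabular Q-learning, so each one's
# environment is non-stationary from its own point of view.
import random

PRICES = [0.2, 0.4, 0.6, 0.8, 1.0]     # assumed discrete action set (posted prices)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed learning hyperparameters

def profit(own_price, rival_price):
    # Toy demand model: the cheaper seller captures most of the market.
    share = 0.8 if own_price < rival_price else (0.5 if own_price == rival_price else 0.2)
    return own_price * share

# State for each agent = the rival's last posted price; Q[i][(state, action)] -> value.
Q = [{}, {}]

def choose(agent, state):
    # Epsilon-greedy action selection over the price grid.
    if random.random() < EPSILON:
        return random.choice(PRICES)
    return max(PRICES, key=lambda a: Q[agent].get((state, a), 0.0))

last = [random.choice(PRICES), random.choice(PRICES)]
for t in range(50_000):
    states = [last[1], last[0]]                       # each agent observes the rival's last price
    acts = [choose(0, states[0]), choose(1, states[1])]
    rewards = [profit(acts[0], acts[1]), profit(acts[1], acts[0])]
    next_states = [acts[1], acts[0]]
    for i in range(2):
        s, a, r, s2 = states[i], acts[i], rewards[i], next_states[i]
        best_next = max(Q[i].get((s2, a2), 0.0) for a2 in PRICES)
        old = Q[i].get((s, a), 0.0)
        # Standard Q-learning update toward the one-step bootstrapped target.
        Q[i][(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)
    last = acts

for i in range(2):
    print("agent", i, "greedy price per observed rival price:",
          {s: max(PRICES, key=lambda a: Q[i].get((s, a), 0.0)) for s in PRICES})
```

In the paper's setting the lookup tables above would be replaced by regression-tree function approximators (e.g., a tree regressor fit to (state, action) -> Q targets) once the state space grows too large to tabulate; that substitution is the subject of the study, not shown here.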
Keywords :
costing; economics; function approximation; learning (artificial intelligence); multi-agent systems; statistical analysis; table lookup; trees (mathematics); automated pricing decisions; convergence; economic software agents; function approximation; long-term expected reward; lookup table; multi-agent Q-learning; optimal policy; regression trees; reinforcement learning; Approximation algorithms; Environmental economics; Function approximation; Learning; Multiagent systems; Pricing; Regression tree analysis; Software agents; Software algorithms; Table lookup;
Conference_Titel :
Proceedings Fourth International Conference on MultiAgent Systems (ICMAS 2000)
Conference_Location :
Boston, MA
Print_ISBN :
0-7695-0625-9
DOI :
10.1109/ICMAS.2000.858518