Title :
A new Q-learning algorithm based on the Metropolis criterion
Author :
Guo, Maozu; Liu, Yang; Malec, Jacek
Author_Institution :
Dept. of Comput. Sci. & Eng., Harbin Inst. of Technol., China
Abstract :
The balance between exploration and exploitation is one of the key problems of action selection in Q-learning. Pure exploitation drives the agent quickly toward locally optimal policies, whereas excessive exploration degrades the performance of the Q-learning algorithm even though it may accelerate the learning process and help avoid locally optimal policies. In this paper, finding the optimal policy in Q-learning is cast as searching for an optimal solution in combinatorial optimization. The Metropolis criterion of the simulated annealing algorithm is introduced to balance exploration and exploitation in Q-learning, and a modified Q-learning algorithm based on this criterion, SA-Q-learning, is presented. Experiments show that SA-Q-learning converges more quickly than Q-learning or Boltzmann exploration, and that it does not suffer from performance degradation due to excessive exploration.
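Algorithm_Sketch :
The record contains no code; the following is a minimal, hypothetical Python sketch of the Metropolis-style action selection the abstract describes, embedded in a tabular Q-learning loop. The names and parameters (metropolis_action, sa_q_learning, chain_step, t0, decay), the geometric cooling schedule, and the toy chain environment are illustrative assumptions, not taken from the paper.

    import math
    import random
    from collections import defaultdict

    def metropolis_action(Q, state, actions, temperature):
        """Metropolis-criterion action selection (illustrative sketch).

        A random candidate action is accepted outright when its Q-value is
        at least that of the greedy action; otherwise it is accepted with
        probability exp((Q_r - Q_p) / T), so exploration fades as T -> 0.
        """
        a_greedy = max(actions, key=lambda a: Q[(state, a)])
        a_random = random.choice(actions)
        delta = Q[(state, a_random)] - Q[(state, a_greedy)]
        if delta >= 0 or random.random() < math.exp(delta / temperature):
            return a_random
        return a_greedy

    def chain_step(state, action):
        """Toy 5-state chain MDP (an assumption, not from the paper):
        action 1 moves right, action 0 resets to state 0; reaching
        state 4 yields reward 1 and ends the episode."""
        next_state = state + 1 if action == 1 else 0
        reward = 1.0 if next_state == 4 else 0.0
        return next_state, reward, next_state == 4

    def sa_q_learning(step, actions, episodes=500, alpha=0.1,
                      gamma=0.9, t0=1.0, decay=0.99):
        """Tabular Q-learning with Metropolis action selection and a
        geometric cooling schedule (both schedules are illustrative)."""
        Q = defaultdict(float)
        temperature = t0
        for _ in range(episodes):
            state, done = 0, False
            while not done:
                action = metropolis_action(Q, state, actions, temperature)
                next_state, reward, done = step(state, action)
                best_next = max(Q[(next_state, b)] for b in actions)
                Q[(state, action)] += alpha * (reward + gamma * best_next
                                               - Q[(state, action)])
                state = next_state
            temperature = max(temperature * decay, 1e-3)  # keep T positive
        return Q

    Q = sa_q_learning(chain_step, actions=[0, 1])
    print(max([0, 1], key=lambda a: Q[(0, a)]))  # expect 1: move right

Because the acceptance probability exp((Q_r - Q_p) / T) is near 1 at high temperature and approaches 0 as T cools, the agent explores broadly early on and shifts toward exploitation later, which is the balance the abstract attributes to the Metropolis criterion.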
Keywords :
combinatorial mathematics; learning (artificial intelligence); multi-agent systems; search problems; simulated annealing; Metropolis criterion; Q-learning algorithm; action selection; combinatorial optimization; optimum policy; reinforcement learning; simulated annealing algorithm; Accelerated aging; Computer science; Degradation; Industrial control; Learning automata; Learning systems; Machine learning algorithms; Partial response channels; Service robots; Simulated annealing; Algorithms; Artificial Intelligence; Computer Simulation; Information Storage and Retrieval; Models, Theoretical;
Journal_Title :
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
DOI :
10.1109/TSMCB.2004.832154