Regional Information Center for Science and Technology (RICEST) - Multi-objective reinforcement learning algorithm and its improved convergency method

DocumentCode :

2641142

Title :

Multi-objective reinforcement learning algorithm and its improved convergency method

Author :

Jin, Zhao ; Huajun, Zhang

Author_Institution :

Dept. of Control Sci. & Eng., Huazhong Univ. of Sci. & Technol., Wuhan, China

fYear :

2011

fDate :

21-23 June 2011

Firstpage :

2438

Lastpage :

2445

Abstract :

This paper proposes a multi-objective reinforcement learning algorithm (MORLA) and uses simultaneous perturbation stochastic approximation (SPSA) to improve its convergence. Reinforcement learning (RL) is usually used to design a neurocontroller for a control system with a single objective. For a multi-objective system, the neurocontroller must be designed according to the designer's preference. MORLA transforms the multiple objectives into a single synthesized objective and applies a parallel genetic algorithm (PGA) to evolve the neurocontroller according to that objective. To establish the synthesized objective, the objective weights, which represent the designer's preference, are calculated by solving a constrained optimization problem (COP) at the end of each generation. The COP requires both that the variance of the synthesized objective across the population be maximized and that the weights fit the designer's preference. Once the weights are obtained, the PGA selects the elite individuals from the population according to the designer's preference and designs a satisfactory neurocontroller through evolutionary operations. Although a GA has good global search ability, it converges slowly in local regions. This paper therefore applies the SPSA algorithm to search for the optimal solution when the GA is oscillating in a local region. SPSA converges quickly by using an efficient gradient approximation that relies on measurements of the objective function. The hybrid algorithm accelerates reinforcement learning. Finally, MORLA is used to design a neurocontroller for a speed-controlled induction motor drive with indirect vector control. Simulation results with different preferences for the drive system show the feasibility and validity of MORLA.
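The gradient approximation that the abstract attributes to SPSA can be illustrated with a minimal, generic sketch. This is not the paper's hybrid GA/SPSA method: the function name `spsa_minimize`, the gain parameters, and the quadratic test loss are assumptions chosen for illustration only. The sketch shows the standard SPSA idea of perturbing all parameters at once with a random ±1 vector and estimating the gradient from two measurements of the objective function.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Generic SPSA sketch (hypothetical helper, not the paper's code).

    loss  : callable taking a list of floats and returning a float
    theta : initial parameter list
    a, c, A, alpha, gamma : gain-sequence constants (illustrative values)
    """
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb every coordinate simultaneously with a random +/-1 vector.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        # Two loss measurements give a gradient estimate for all coordinates.
        diff = loss(plus) - loss(minus)
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


def quadratic_loss(x):
    """Illustrative objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


if __name__ == "__main__":
    start = [0.0, 0.0]
    result = spsa_minimize(quadratic_loss, start)
    print(quadratic_loss(start), quadratic_loss(result))
```

Each iteration costs two objective evaluations regardless of the number of parameters, which is why SPSA is attractive for refining a candidate solution once a global search has stalled.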

Keywords :

genetic algorithms; learning (artificial intelligence); neurocontrollers; constrained optimization problem; indirect vector control; multiobjective reinforcement learning; multiobjective system; neurocontroller; objective function; parallel genetic algorithm; simultaneous perturbation stochastic approximation; speed-controlled induction motor drive; Algorithm design and analysis; Approximation methods; Control systems; Convergence; Genetic algorithms; Learning; Neurocontrollers; SPSA; multi-objective reinforcement learning; speed-controlled

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Industrial Electronics and Applications (ICIEA), 2011 6th IEEE Conference on

Conference_Location :

Beijing

ISSN :

pending

Print_ISBN :

978-1-4244-8754-7

Electronic_ISBN :

pending

Type :

conf

DOI :

10.1109/ICIEA.2011.5976002

Filename :

5976002

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2641142