Title :
Multi-objective reinforcement learning algorithm and its improved convergency method
Author :
Jin, Zhao ; Huajun, Zhang
Author_Institution :
Dept. of Control Sci. & Eng., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
This paper proposes a multi-objective reinforcement learning algorithm (MORLA) and uses simultaneous perturbation stochastic approximation (SPSA) to improve its convergence. Reinforcement learning (RL) is usually used to design a neurocontroller for a control system with a single objective. For a multi-objective system, the neurocontroller must be designed according to the designer's preference. The MORLA transforms the multiple objectives into a single synthetic objective and applies a parallel genetic algorithm (PGA) to evolve the neurocontroller according to that objective. To establish the synthetic objective, the objective weights, which represent the designer's preference, are calculated by solving a constrained optimization problem (COP) at the end of each generation. The COP requires both that the synthetic objective have the largest variance across the population and that the weights fit the designer's preference. After acquiring the weights, the PGA selects the elite individuals from the population according to the designer's preference and designs a satisfactory neurocontroller through evolutionary operations. In addition, although a GA has good global search ability, it converges slowly within a local region. This paper applies the SPSA algorithm to search for the optimal solution when the GA oscillates within a local region. SPSA converges quickly through an efficient gradient approximation that relies on measurements of the objective function. The hybrid algorithm accelerates reinforcement learning. Finally, the MORLA is used to design a neurocontroller for a speed-controlled induction motor drive with indirect vector control. Simulation results under different designer preferences for the drive system show the feasibility and validity of the MORLA.
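The SPSA step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (spsa_minimize, quadratic), the toy quadratic objective, and all gain values are assumptions chosen for illustration. The decay exponents 0.602 and 0.101 are commonly recommended defaults for SPSA gain sequences. The point shown is that each iteration estimates the whole gradient from only two measurements of the objective, whatever the number of parameters.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize `loss` with SPSA (hypothetical helper, illustrative gains)."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        # Decaying step size a_k and perturbation size c_k.
        a_k = a / (k + 1 + A) ** alpha
        c_k = c / (k + 1) ** gamma
        # Perturb all parameters at once with a random +/-1 vector.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        theta_plus = [t + c_k * d for t, d in zip(theta, delta)]
        theta_minus = [t - c_k * d for t, d in zip(theta, delta)]
        # Two objective measurements give an estimate of the full gradient.
        diff = loss(theta_plus) - loss(theta_minus)
        grad = [diff / (2.0 * c_k * d) for d in delta]
        # Gradient-descent style update with the estimated gradient.
        theta = [t - a_k * g for t, g in zip(theta, grad)]
    return theta


def quadratic(theta):
    """Toy objective with its minimum at (1, 1); stands in for a real cost."""
    return sum((t - 1.0) ** 2 for t in theta)


start = [3.0, -2.0]
result = spsa_minimize(quadratic, start)
print(quadratic(start), quadratic(result))
```

In a hybrid scheme of the kind the abstract describes, a routine like this would presumably be started from the best individual found by the GA once its progress stalls; that hand-off is not shown here.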
Keywords :
genetic algorithms; learning (artificial intelligence); neurocontrollers; constrained optimization problem; indirect vector control; multiobjective reinforcement learning; multiobjective system; neurocontroller; objective function; parallel genetic algorithm; simultaneous perturbation stochastic approximation; speed-controlled induction motor drive; Algorithm design and analysis; Approximation methods; Control systems; Convergence; Genetic algorithms; Learning; Neurocontrollers; SPSA; multi-objective reinforcement learning; speed-controlled
Conference_Title :
Industrial Electronics and Applications (ICIEA), 2011 6th IEEE Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-8754-7
Electronic_ISBN :
pending
DOI :
10.1109/ICIEA.2011.5976002