Title :
Multi-objective reinforcement learning for acquiring all Pareto optimal policies simultaneously - Method of determining scalarization weights
Author :
Iima, Hitoshi ; Kuroe, Yasuaki
Author_Institution :
Dept. of Inf. Sci., Kyoto Inst. of Technol., Kyoto, Japan
Abstract :
We recently proposed a multi-objective reinforcement learning method that acquires all Pareto optimal policies simultaneously by introducing the concept of convex hulls into Q-learning. In this method, state-action value vectors are obtained through a single learning run, and each Pareto optimal policy is then derived by scalarizing the obtained state-action value vectors with a weight vector. Because learning is performed only once, all the Pareto optimal policies can be found by determining the weight vectors adequately and applying them in the scalarization. This paper proposes a method of determining these scalarization weight vectors. The performance of the proposed method is evaluated through numerical experiments.
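The scalarization step described above can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: it assumes the learned state-action values are stored as a vector per state-action pair (one component per objective), and that a greedy policy for a given weight vector is obtained by maximizing the weighted sum in each state. The function name and array layout are hypothetical.

```python
import numpy as np

def scalarized_policy(q_vectors, w):
    """Derive a greedy policy from vector-valued Q-values via scalarization.

    q_vectors: array of shape (n_states, n_actions, n_objectives),
               the learned state-action value vectors.
    w:         weight vector of shape (n_objectives,).
    Returns the greedy action index for each state under the scalarized values.
    """
    scalar_q = q_vectors @ w       # weighted sum -> shape (n_states, n_actions)
    return scalar_q.argmax(axis=1)  # greedy action per state

# Toy example: 2 states, 2 actions, 2 objectives.
q = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.5, 0.5], [0.2, 0.9]]])

# A weight vector emphasizing objective 0 selects action 0 in both states.
print(scalarized_policy(q, np.array([0.8, 0.2])))  # -> [0 0]
```

Sweeping over different weight vectors and collecting the resulting greedy policies is what recovers the set of Pareto optimal policies; the paper's contribution is how to choose those weight vectors adequately.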
Keywords :
Pareto optimisation; convex programming; learning (artificial intelligence); mathematics computing; vectors; Pareto optimal policies; Q-learning; convex hulls; multi-objective reinforcement learning; scalarization weight vectors; state-action value vectors; learning systems; Markov processes; mathematical models; multi-objective problem;
Conference_Titel :
Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on
Conference_Location :
San Diego, CA
DOI :
10.1109/SMC.2014.6974022