Title :
Adaptive Profit Sharing Reinforcement Learning Method for Dynamic Environment
Author :
Koujaku, Sadamori ; Watanabe, Kota ; Igarashi, Hajime
Author_Institution :
Grad. Sch. of Inf. Sci. Technol., Hokkaido Univ., Sapporo, Japan
Abstract :
This paper introduces an Adaptive Forgettable Profit Sharing reinforcement learning method that enables agents to adapt quickly to environmental changes. It can be used to learn robust and effective actions in uncertain environments with the non-Markov property, in particular partially observable Markov decision processes (POMDPs). Profit Sharing learns a rational policy that is easy to learn and yields good behavior in POMDPs. However, the policy degrades in large, dynamic environments that change frequently and require many actions to reach the goal. To handle such environments, a forgetting mechanism is implemented that gives Profit Sharing both adaptability and rationality. It allows the agent to forget past experiences that reduce the rationality of its policy. The usefulness of the proposed algorithm is demonstrated through numerical examples.
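The abstract does not give the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the usual Profit Sharing scheme in which every (state, action) pair of a rewarded episode is reinforced with a geometrically decreasing credit, and it models forgetting as a uniform decay of all stored weights. The function name `profit_sharing_update` and the parameters `credit_ratio` and `forget_rate` are hypothetical.

```python
def profit_sharing_update(weights, episode, reward,
                          credit_ratio=0.5, forget_rate=0.0):
    """Reinforce the (state, action) pairs of one rewarded episode.

    weights      -- dict mapping (state, action) -> weight
    episode      -- list of (state, action) pairs, oldest first
    reward       -- reward received at the end of the episode
    credit_ratio -- ratio of the geometric credit function (assumed)
    forget_rate  -- decay in [0, 1) applied to all weights (assumed;
                    the paper's forgetting rule may differ)
    """
    # Hypothetical forgetting step: shrink every stored weight so that
    # old experience fades after the environment changes.
    for key in weights:
        weights[key] *= 1.0 - forget_rate

    # Credit assignment: the last action receives the full reward and
    # each earlier action receives credit_ratio times its successor's.
    credit = reward
    for state, action in reversed(episode):
        key = (state, action)
        weights[key] = weights.get(key, 0.0) + credit
        credit *= credit_ratio
    return weights


weights = {}
profit_sharing_update(weights, [("s0", "a"), ("s1", "b")], reward=1.0)
profit_sharing_update(weights, [("s1", "a")], reward=1.0, forget_rate=0.5)
```

In this sketch a larger `forget_rate` lets new episodes override old weights faster, at the cost of discarding experience that may still be valid.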
Keywords :
Markov processes; incentive schemes; learning (artificial intelligence); adaptive forgettable profit sharing reinforcement learning method; dynamic environment; nonMarkov property; partial observable Markov process; Educational institutions; Information science; Learning; Learning systems; Markov processes; Robustness; Reinforcement Learning; forgetting; rational theorem;
Conference_Title :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
DOI :
10.1109/ICMLA.2011.25