DocumentCode
3269713
Title
Reinforcement learning to train Ms. Pac-Man using higher-order action-relative inputs
Author
Bom, Luuk ; Henken, Ruud ; Wiering, Marco
Author_Institution
Inst. of Artificial Intell. & Cognitive Eng., Univ. of Groningen, Groningen, Netherlands
fYear
2013
fDate
16-19 April 2013
Firstpage
156
Lastpage
163
Abstract
Reinforcement learning algorithms enable an agent to optimize its behavior by interacting with a specific environment. Although some very successful applications of reinforcement learning have been developed, how to scale up to large dynamic environments remains an open research question. In this paper we study the use of reinforcement learning on the popular arcade video game Ms. Pac-Man. To let Ms. Pac-Man learn quickly, we designed feature extraction algorithms that produce higher-order inputs from the game state. We constructed these higher-order features so that they are relative to the action of Ms. Pac-Man. The action-relative inputs are given to a single neural network, trained using Q-learning, which propagates them sequentially to obtain the Q-value of each action. The experimental results show that this approach allows the use of only 7 input units in the neural network, while still quickly obtaining very good playing behavior. Furthermore, the experiments show that our approach enables Ms. Pac-Man to successfully transfer its learned policy to a different maze on which it was not trained.
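The single-network idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives only the number of input units (7) and the use of Q-learning, so the hidden-layer size, tanh activation, learning rate, discount factor, action names, and all identifiers (`QNetwork`, `q_values`, `q_learning_step`) are assumptions made for illustration, and the feature vectors are placeholders rather than the paper's higher-order inputs.

```python
import math
import random

N_INPUTS = 7          # the abstract reports 7 input units
N_HIDDEN = 8          # assumed; the abstract does not give the hidden-layer size
ACTIONS = ("up", "down", "left", "right")  # assumed action names


class QNetwork:
    """One small network shared by all actions: 7 inputs -> hidden -> 1 Q-value."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS)]
                   for _ in range(N_HIDDEN)]
        self.b1 = [0.0] * N_HIDDEN
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
        self.b2 = 0.0

    def forward(self, x):
        """Return the Q-value for one action-relative input vector, plus hidden activations."""
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        q = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return q, hidden

    def update(self, x, target, lr=0.05):
        """One gradient step on 0.5 * (target - Q)^2 for a single input vector."""
        q, hidden = self.forward(x)
        err = target - q
        for j in range(N_HIDDEN):
            # Backpropagated error for hidden unit j, computed before w2[j] changes.
            grad_h = err * self.w2[j] * (1.0 - hidden[j] ** 2)
            self.w2[j] += lr * err * hidden[j]
            for i in range(N_INPUTS):
                self.w1[j][i] += lr * grad_h * x[i]
            self.b1[j] += lr * grad_h
        self.b2 += lr * err


def q_values(net, features_per_action):
    """Propagate each action's feature vector through the same network in turn."""
    return {a: net.forward(x)[0] for a, x in features_per_action.items()}


def q_learning_step(net, x_taken, reward, next_features, gamma=0.95, terminal=False):
    """Standard Q-learning target: r + gamma * max over a' of Q(s', a')."""
    best_next = 0.0 if terminal else max(q_values(net, next_features).values())
    net.update(x_taken, reward + gamma * best_next)


if __name__ == "__main__":
    rng = random.Random(42)
    net = QNetwork()
    # Placeholder features: one 7-element vector per action, values in [0, 1].
    feats = {a: [rng.random() for _ in range(N_INPUTS)] for a in ACTIONS}
    qs = q_values(net, feats)
    greedy = max(qs, key=qs.get)
    q_learning_step(net, feats[greedy], reward=1.0, next_features=feats)
    print(greedy, q_values(net, feats)[greedy])
```

Because the inputs are expressed relative to the action being evaluated, the same weights score every action, which is one plausible reading of how a network with so few inputs could carry over to an unseen maze.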
Keywords
computer games; feature extraction; learning (artificial intelligence); neural nets; Q-learning; arcade video game Ms. Pac-Man; dynamic environments; game state; higher order action relative inputs; neural network; open research question; reinforcement learning algorithms; smart feature extraction algorithms; train Ms. Pac-Man; Biological neural networks; Games; Heuristic algorithms; Learning (artificial intelligence); Neurons; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2013 IEEE Symposium on
Conference_Location
Singapore
ISSN
2325-1824
Type
conf
DOI
10.1109/ADPRL.2013.6615002
Filename
6615002