Title :
Emergence of flexible prediction-based discrete decision making and continuous motion generation through actor-Q-learning
Author :
Shibata, Kenji ; Goto, Keisuke
Author_Institution :
Dept. of Electr. & Electron. Eng., Oita Univ., Oita, Japan
Abstract :
In this paper, the authors first point out the importance of three factors for closing the gap between humans and robots in real-world flexibility: (1) parallel processing, (2) emergence through learning and solving “what” problems, and (3) abstraction and generalization on the abstract space. To explore the possibility of human-like flexibility in robots, a prediction-required task, in which an agent (robot) gets a reward by capturing a moving target that sometimes becomes invisible, was learned by reinforcement learning using a recurrent neural network. Even though the agent did not know in advance that “prediction is required” or “what information should be predicted”, it acquired both appropriate discrete decision making, in which 'capture' or 'move' was chosen, and continuous motion generation in two-dimensional space. Furthermore, in this task, the target sometimes changed its moving direction randomly when it became visible again after being invisible. The agent could then change its moving direction promptly and appropriately without any special architecture or technique being introduced. Such an emergent property is something that general parallel processing systems such as the Subsumption architecture do not have, and the authors believe it is a key to solving the “Frame Problem” fundamentally.
Keywords :
continuous systems; discrete systems; generalisation (artificial intelligence); learning systems; motion control; neurocontrollers; predictive control; problem solving; recurrent neural nets; robots; 2D space continuous motion generation; abstract space; abstraction; actor-Q-learning; agent moving direction changing; flexible prediction-based discrete decision making; frame problem; generalization; human-like flexibility; parallel processing; recurrent neural network; reinforcement learning; subsumption architecture; what problem solving; Neurons; Recurrent neural networks; Robot sensing systems; Timing; Training; Vectors;
Conference_Title :
Development and Learning and Epigenetic Robotics (ICDL), 2013 IEEE Third Joint International Conference on
Conference_Location :
Osaka
DOI :
10.1109/DevLrn.2013.6652559