DocumentCode
554139
Title
Notice of Retraction
Shaping agent by critical states
Author
Jiong Song ; Jin Zhao
Author_Institution
Yunnan Jiao Tong Vocational & Tech. Coll., Kunming, China
Volume
3
fYear
2011
fDate
26-28 July 2011
Firstpage
1314
Lastpage
1317
Abstract
Notice of Retraction
After careful and considered review of the content of this paper by a duly constituted expert committee, this paper has been found to be in violation of IEEE's Publication Principles.
We hereby retract the content of this paper. Reasonable effort should be made to remove all past references to this paper.
The presenting author of this paper has the option to appeal this decision by contacting TPII@ieee.org.
Shaping is a promising technique for scaling reinforcement learning to large and complex problems, but designing and tuning a shaping reward is difficult and problem-specific. We propose an approach in which the agent shapes itself using critical states that it discovers from its own prior learning. We accumulate the state trajectories that the agent experiences in every training episode and eliminate the state loops in them; the resulting acyclic state trajectories are then used to find the critical states. A critical state is a state that has a high probability of appearing in these acyclic state trajectories; that is, an agent that reaches the goal state is likely to have passed through the critical states. The critical states can therefore be used to shape the agent so that it reaches the goal state faster. The Grid-World problem is used to illustrate the applicability and effectiveness of the approach. More importantly, the approach lets the agent shape itself from what it has learned.
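The two steps named in the abstract, loop elimination and critical-state detection, can be sketched as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `remove_loops` and `critical_states`, the `threshold` parameter, the `exclude` argument, and the rule "fraction of acyclic trajectories containing the state" are all assumptions, since the record gives no algorithmic details.

```python
from collections import Counter


def remove_loops(trajectory):
    """Return the trajectory with state loops cut out.

    When a state is revisited, everything recorded since its first
    visit is discarded, so each state appears at most once.
    """
    acyclic = []
    position = {}  # state -> index of that state in `acyclic`
    for state in trajectory:
        if state in position:
            cut = position[state] + 1
            for dropped in acyclic[cut:]:
                del position[dropped]
            del acyclic[cut:]
        else:
            position[state] = len(acyclic)
            acyclic.append(state)
    return acyclic


def critical_states(trajectories, threshold=0.8, exclude=()):
    """States found in at least `threshold` of the acyclic trajectories.

    `threshold` and `exclude` are hypothetical knobs for this sketch;
    `exclude` is meant for trivially shared states such as start and goal.
    """
    acyclic = [remove_loops(t) for t in trajectories]
    # Count each state once per trajectory.
    counts = Counter(s for t in acyclic for s in set(t))
    total = len(acyclic)
    return {
        s for s, c in counts.items()
        if c / total >= threshold and s not in exclude
    }


# Made-up episodes in a small grid world with a single doorway.
episodes = [
    ["S", "a", "b", "a", "door", "G"],
    ["S", "c", "door", "d", "G"],
    ["S", "a", "door", "door", "G"],
]
found = critical_states(episodes, threshold=0.8, exclude={"S", "G"})
```

In this toy data only `door` lies on every loop-free path, so it is the one state that passes the threshold. How the detected states are then turned into a shaping reward is not described in this record and is left out of the sketch.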
Keywords
grid computing; learning (artificial intelligence); probability; software agents; acyclic state; critical states; grid-world problem; probability; reinforcement learning scaling; shaping agent; state loop elimination; Algorithm design and analysis; Humans; Learning; Machine learning; Shape; Training; Trajectory;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Computation (ICNC), 2011 Seventh International Conference on
Conference_Location
Shanghai
ISSN
2157-9555
Print_ISBN
978-1-4244-9950-2
Type
conf
DOI
10.1109/ICNC.2011.6022342
Filename
6022342