DocumentCode
3454428
Title
A unifying framework for HAMs-family HRL methods
Author
Du Xiaoqin ; Qinghua, Li ; Jianjun, Han
Author_Institution
Coll. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan
fYear
2007
fDate
15-18 Dec. 2007
Firstpage
1978
Lastpage
1982
Abstract
In the field of hierarchical reinforcement learning (HRL), there are three main methods: HAMs (hierarchical abstract machines), options, and MAXQ. All of these methods rely on the theory of SMDPs. While the SMDP framework allows us to directly model high-level actions that take varying amounts of time, it provides little concrete representational guidance, which is critical from a computational and analytical point of view. In particular, the SMDP framework does not specify how the overall task can be decomposed into a collection of subtasks, a decomposition that is needed to perform state abstraction and subtask sharing for individual subtasks or modules. In addition, we also want to be able to choose between hierarchical optimality and recursive optimality for a given hierarchy on our problem. This paper introduces a unifying framework for HAMs-family methods. Based on this framework, we can define HAMs or sub-HAM homomorphism for state abstraction and can also freely select between alternative notions of policy optimality.
Keywords
finite automata; learning (artificial intelligence); HAMs-family HRL methods; concrete representational guidance; hierarchical abstract machines; hierarchical optimality; hierarchical reinforcement learning; recursive optimality; state abstraction; subHAM homomorphism; subtask sharing; Automata; Biomimetics; Computer science; Concrete; Educational institutions; Machine learning; Power system modeling; Robots; State-space methods; Stochastic processes; HAMs; Hierarchical Reinforcement Learning; Reinforcement Learning; SMDPs;
fLanguage
English
Publisher
IEEE
Conference_Titel
Robotics and Biomimetics, 2007. ROBIO 2007. IEEE International Conference on
Conference_Location
Sanya
Print_ISBN
978-1-4244-1761-2
Electronic_ISBN
978-1-4244-1758-2
Type
conf
DOI
10.1109/ROBIO.2007.4522470
Filename
4522470