DocumentCode :
188650
Title :
nso-HSVI: A Not-So-Optimistic Heuristic Search Value Iteration Algorithm for POMDPs
Author :
Feng Liu ; Haibo Li ; Chongjun Wang
Author_Institution :
Nat. Key Lab. for Novel Software Technol. Software Inst., Nanjing Univ., Nanjing, China
fYear :
2014
fDate :
10-12 Nov. 2014
Firstpage :
689
Lastpage :
693
Abstract :
Point-based value iteration methods improve computational efficiency by reducing the size of the search space. Although algorithms such as HSVI and GapMin can reach a globally optimal solution, their exploration of the optimal action is overly optimistic, which reduces efficiency. In this paper, we propose a novel heuristic search method, nso-HSVI (not-so-optimistic Heuristic Search Value Iteration), which uses a Monte-Carlo method to estimate the probability that each action is optimal according to the distribution of the actions' Q-value functions, and applies the action with the maximum probability. Experimental results show that nso-HSVI outperforms HSVI, and does so by a large margin as the scale of the POMDP increases.
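The action-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the Q-value distribution is modelled, so the sketch assumes, purely for illustration, that each action's Q-value at the current belief is uniformly distributed between a lower and an upper bound. The names `pick_action`, `optimistic_action`, and `bounds` are hypothetical.

```python
import random


def optimistic_action(bounds):
    """HSVI-style choice: the action with the highest upper bound."""
    return max(bounds, key=lambda a: bounds[a][1])


def pick_action(bounds, n_samples=10_000, seed=0):
    """Monte-Carlo estimate of P(action is optimal), then pick the most likely.

    bounds maps each action to a (lower, upper) pair on its Q-value.
    ASSUMPTION: Q-values are sampled uniformly between the two bounds;
    the paper may use a different distribution.
    """
    rng = random.Random(seed)
    wins = {a: 0 for a in bounds}
    for _ in range(n_samples):
        # Draw one plausible Q-value per action and record which action wins.
        draws = {a: rng.uniform(lo, hi) for a, (lo, hi) in bounds.items()}
        wins[max(draws, key=draws.get)] += 1
    probs = {a: wins[a] / n_samples for a in bounds}
    return max(probs, key=probs.get), probs


if __name__ == "__main__":
    # Action "a" has the highest upper bound but a very wide interval.
    bounds = {"a": (0.0, 10.0), "b": (6.0, 8.0)}
    print("optimistic choice:", optimistic_action(bounds))
    print("probability-based choice:", pick_action(bounds))
```

In this toy setting the optimistic rule favours the action with the wide interval because of its high upper bound, while the sampled probabilities favour the action whose interval sits consistently high, which is the kind of over-optimism the abstract says nso-HSVI is meant to temper.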
Keywords :
Markov processes; Monte Carlo methods; iterative methods; optimisation; probability; GapMin; Monte-Carlo method; POMDP; Q-value function; heuristic search method; not-so-optimistic heuristic search value iteration; nso-HSVI; optimization; partially observable Markov decision processes; point-based value iteration methods; probability; search space size; Algorithm design and analysis; Approximation algorithms; Approximation methods; Convergence; Probability density function; Upper bound; Vectors; POMDP; nso-HSVI;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
Conference_Location :
Limassol
ISSN :
1082-3409
Type :
conf
DOI :
10.1109/ICTAI.2014.108
Filename :
6984544