DocumentCode :
3268656
Title :
Large-scale tabular-form hardware architecture for Q-Learning with delays
Author :
Liu, Zhenzhen ; Elhanany, Itamar
Author_Institution :
Univ. of Tennessee, Knoxville
fYear :
2007
fDate :
5-8 Aug. 2007
Firstpage :
827
Lastpage :
830
Abstract :
Q-Learning is a popular reinforcement learning algorithm that has been widely used in stochastic control applications. The bottleneck in applying tabular-form Q-Learning to problems with large-scale or high-dimensional action sets is the considerable delay caused by action selection and value-function updates. In this paper, we present a novel hardware architecture that significantly reduces these delays. Moreover, we formulate the Q-Learning algorithm in the presence of observation and action delays and provide a set of proofs confirming that Q-Learning with such delays converges to the optimal policy.
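For context, the serial bottleneck the abstract refers to can be seen in a minimal software sketch of standard tabular Q-Learning. This is an illustration only, not the paper's hardware design or its delayed formulation; all names, sizes, and parameters below are assumptions.

import numpy as np

# Hypothetical large tabular problem; sizes chosen only for illustration.
n_states, n_actions = 1024, 256
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount, exploration
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def select_action(s):
    # Epsilon-greedy selection: the argmax scans every action, an O(|A|)
    # serial delay in software that dedicated hardware can parallelize.
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def update(s, a, r, s_next):
    # One-step Q-Learning value-function update; the max over actions
    # incurs the same O(|A|) scan as action selection.
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

Both per-step operations scale linearly with the action-set size when executed serially, which is why large or high-dimensional action sets motivate the hardware acceleration the paper proposes.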
Keywords :
delays; learning (artificial intelligence); Q-learning; large-scale tabular-form hardware; optimal policy; reinforcement learning algorithm; stochastic control; Hardware; Large-scale systems
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Circuits and Systems, 2007. MWSCAS 2007. 50th Midwest Symposium on
Conference_Location :
Montreal, Que.
ISSN :
1548-3746
Print_ISBN :
978-1-4244-1175-7
Electronic_ISBN :
1548-3746
Type :
conf
DOI :
10.1109/MWSCAS.2007.4488701
Filename :
4488701