DocumentCode :
1299566
Title :
Learning automata processing ergodicity of the mean: The two-action case
Author :
Thathachar, M.A.L. ; Oommen, B. John
Author_Institution :
Dept. of Electrical Engng., Indian Inst. of Sci., Bangalore, India
Issue :
6
fYear :
1983
Firstpage :
1143
Lastpage :
1148
Abstract :
Learning automata that update their action probabilities on the basis of the responses they receive from an environment are considered. The automata update the probabilities whether the environment responds with a reward or a penalty. An automaton is said to possess ergodicity of the mean (EM) if the mean action probability is the total state probability of an ergodic Markov chain. The only previously known EM algorithm is the linear reward-penalty (LRP) scheme. For the two-action case, necessary and sufficient conditions are derived for nonlinear updating schemes to be EM. A method of controlling the rate of convergence of such schemes is presented. In particular, a generalized linear algorithm is proposed which is superior to the LRP scheme, and an expression for the variance of its limiting action probabilities is derived.
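The linear reward-penalty update mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard two-action LRP rule, not the paper's generalized scheme; the function name `lrp_step` and the step sizes `a` and `b` are illustrative choices, not taken from the paper.

```python
def lrp_step(p, action, reward, a=0.05, b=0.05):
    """One two-action linear reward-penalty (LRP) update.

    p      : current probability of selecting action 0 (action 1 has 1 - p)
    action : the action taken this step, 0 or 1
    reward : True if the environment rewarded the action, False for a penalty
    a, b   : reward and penalty step sizes in (0, 1)

    Returns the updated probability of action 0.
    """
    # Work with the probability of the action actually chosen.
    p_chosen = p if action == 0 else 1.0 - p
    if reward:
        # Reward: move the chosen action's probability toward 1.
        p_chosen += a * (1.0 - p_chosen)
    else:
        # Penalty: shrink the chosen action's probability toward 0
        # (equivalently, shift mass to the other action).
        p_chosen *= (1.0 - b)
    return p_chosen if action == 0 else 1.0 - p_chosen
```

With `a = b` this is the symmetric LRP scheme; because the penalty response also moves the probabilities, the action probabilities form an ergodic Markov chain rather than absorbing at 0 or 1.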
Keywords :
adaptive systems; automata theory; learning systems; adaptive systems; ergodic Markov chain; ergodicity of the mean; learning automata; learning systems; linear reward-penalty; two-action case; Automata; Convergence; Discrete Fourier transforms; Learning automata; Limiting; Markov processes; Vectors;
fLanguage :
English
Journal_Title :
Systems, Man and Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9472
Type :
jour
DOI :
10.1109/TSMC.1983.6313191
Filename :
6313191