DocumentCode :
1055090
Title :
Ergodic Learning Automata Capable of Incorporating a Priori Information
Author :
Oommen, B.J.
Author_Institution :
School of Computer Science, Carleton University, Ottawa, Canada K1S 5B6
Volume :
17
Issue :
4
fYear :
1987
fDate :
7/1/1987
Firstpage :
717
Lastpage :
723
Abstract :
This paper considers learning automata that update their action probabilities on the basis of the responses they receive from a random environment. The automata update the probabilities whether the environment responds with a reward or a penalty. A learning automaton is said to be ergodic if the distribution of the limiting action probability vector is independent of the initial distribution. An ergodic scheme is presented that can take into account a priori information about the action probabilities. This is the only scheme reported in the literature capable of doing so. The mean and the variance of the limiting distribution of the automaton are derived, and it is shown that the mean is not independent of the a priori information. Further, it is shown that the expressions for these quantities are generalizations of the corresponding quantities derived for the familiar L_RP (linear reward-penalty) scheme. Finally, it is shown that by constantly updating the parameter quantifying the a priori information, a resultant linear scheme can be obtained. This scheme is of a reward-reward flavor and yet is absolutely expedient. It falls within the class of absolutely expedient schemes presented by Aso and Kimura.
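The ergodicity property described in the abstract can be illustrated with the familiar L_RP scheme that the abstract names as a special case. The following is a minimal sketch of the standard textbook two-action L_RP update, not the a priori scheme of the paper, which the abstract does not specify. The function names, the learning parameter `a`, and the penalty probabilities used below are assumptions chosen for illustration only.

```python
import random

def lrp_update(p, chosen, rewarded, a=0.05):
    """One symmetric linear reward-penalty (L_RP) step on probability vector p.

    Standard textbook form (an assumption here, not taken from the paper):
    the same step size `a` is used for both reward and penalty.
    """
    r = len(p)
    q = list(p)
    for j in range(r):
        if rewarded:
            # Reward: shift probability mass toward the chosen action.
            q[j] = p[j] + a * (1.0 - p[j]) if j == chosen else (1.0 - a) * p[j]
        else:
            # Penalty: shift mass away from the chosen action, spread evenly.
            q[j] = (1.0 - a) * p[j] if j == chosen else a / (r - 1) + (1.0 - a) * p[j]
    return q

def simulate(penalty_probs, p0, steps=20000, a=0.05, seed=0):
    """Run the automaton against a stationary random environment and
    return the time-averaged action probability vector."""
    rng = random.Random(seed)
    p = list(p0)
    total = [0.0] * len(p)
    for _ in range(steps):
        chosen = rng.choices(range(len(p)), weights=p)[0]
        # The environment penalizes action i with probability penalty_probs[i].
        rewarded = rng.random() >= penalty_probs[chosen]
        p = lrp_update(p, chosen, rewarded, a)
        total = [t + x for t, x in zip(total, p)]
    return [t / steps for t in total]

if __name__ == "__main__":
    c = [0.2, 0.6]  # hypothetical penalty probabilities
    # Two very different starting vectors should give similar averages.
    print(simulate(c, [0.1, 0.9], seed=1))
    print(simulate(c, [0.9, 0.1], seed=2))
```

For two actions with penalty probabilities c1 and c2, the expected change in p1 under this update is a[c2 - (c1 + c2) p1], so the limiting mean of p1 is c2 / (c1 + c2) regardless of the starting vector. The two runs above should therefore both average near 0.75 for the first action.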
Keywords :
Automatic testing; Biological system modeling; Councils; Learning automata; Machine learning; Pattern recognition; Routing; Stochastic processes; System testing; Telephony
fLanguage :
English
Journal_Title :
Systems, Man and Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9472
Type :
jour
DOI :
10.1109/TSMC.1987.289367
Filename :
4075690