DocumentCode
1055090
Title
Ergodic Learning Automata Capable of Incorporating a Priori Information
Author
Oommen, B.J.
Author_Institution
School of Computer Science, Carleton University, Ottawa, Canada K1S 5B6
Volume
17
Issue
4
fYear
1987
fDate
7/1/1987 12:00:00 AM
Firstpage
717
Lastpage
723
Abstract
Learning automata are considered which update their action probabilities on the basis of the responses they get from a random environment. The automata update the probabilities whether the environment responds with a reward or a penalty. Learning automata are said to be ergodic if the distribution of the limiting action probability vector is independent of the initial distribution. An ergodic scheme is presented which can take into consideration a priori information about the action probabilities. This is the only reported scheme in the literature capable of achieving this. The mean and the variance of the limiting distribution of the automaton are derived, and it is shown that the mean is not independent of the a priori information. Further, it is shown that the expressions for the foregoing quantities are general cases of the corresponding quantities derived for the familiar L_RP (linear reward-penalty) scheme. Finally, it is shown that by constantly updating the parameter quantifying the a priori information, a resultant linear scheme can be obtained. This scheme is of a reward-reward flavor and yet is absolutely expedient. It falls within the class of absolutely expedient schemes presented by Aso and Kimura.
Keywords
Automatic testing; Biological system modeling; Councils; Learning automata; Machine learning; Pattern recognition; Routing; Stochastic processes; System testing; Telephony;
fLanguage
English
Journal_Title
Systems, Man and Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9472
Type
jour
DOI
10.1109/TSMC.1987.289367
Filename
4075690