DocumentCode :
1299955
Title :
A learning algorithm for the finite-time two-armed bandit problem
Author :
Sato, Mitsuhisa ; Abe, Kiyohiko ; Takeda, H.
Author_Institution :
Dept. of Electrical Engng., Tohoku Univ., Sendai, Japan
Issue :
3
Year :
1984
Firstpage :
528
Lastpage :
534
Abstract :
A simple algorithm for the finite-time two-armed bandit problem is proposed. In this algorithm, the whole process is divided into an initial estimating process followed by a controlling process. Efficient approximation-based methods for computing the optimal length of the estimating process are provided.
Keywords :
game theory; learning systems; approximation; controlling process; estimating process; finite-time; learning algorithm; optimal length; two-armed bandit problem; Algorithm design and analysis; Approximation methods; Estimation; Nickel; Niobium; Process control; Skeleton;
Language :
English
Journal_Title :
Systems, Man and Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9472
Type :
jour
DOI :
10.1109/TSMC.1984.6313253
Filename :
6313253