DocumentCode :
1912946
Title :
Optimal learning of transition probabilities in the two-agent newsvendor problem
Author :
Ryzhov, Ilya O. ; Valdez-Vivas, Martin R. ; Powell, Warren B.
Author_Institution :
Oper. Res. & Financial Eng., Princeton Univ., Princeton, NJ, USA
fYear :
2010
fDate :
5-8 Dec. 2010
Firstpage :
1088
Lastpage :
1098
Abstract :
We examine a newsvendor problem with two agents: a requesting agent that observes private demand information, and an oversight agent that must determine how to allocate resources upon receiving a bid from the requesting agent. Because the two agents have different cost structures, the requesting agent tends to bid higher than the amount that is actually needed. As a result, the allocating agent needs to adaptively learn how to interpret the bids and estimate the requesting agent's biases. Learning must occur as quickly as possible, because each suboptimal resource allocation incurs an economic cost. We present a mathematical model that casts the problem as a Markov decision process with unknown transition probabilities. We then perform a simulation study comparing four different techniques for optimal learning of transition probabilities. The best technique is shown to be a knowledge gradient algorithm, based on a one-period look-ahead approach.
Keywords :
Markov processes; decision theory; learning (artificial intelligence); multi-agent systems; operations research; probability; resource allocation; Markov decision process; knowledge gradient algorithm; mathematical model; one period look ahead approach; optimal learning; oversight agent; private demand information; suboptimal resource allocation; transition probability; two agent newsvendor problem; unknown transition probabilities; Approximation methods; Bayesian methods; Games; History; Markov processes; Mathematical model; Resource management;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the 2010 Winter Simulation Conference (WSC)
Conference_Location :
Baltimore, MD
ISSN :
0891-7736
Print_ISBN :
978-1-4244-9866-6
Type :
conf
DOI :
10.1109/WSC.2010.5679081
Filename :
5679081