DocumentCode :
130234
Title :
The Nash and the bandit approaches for adversarial portfolios
Author :
St-Pierre, David L. ; Teytaud, Olivier
Author_Institution :
LRI, Univ. Paris-Sud, Paris, France
fYear :
2014
fDate :
26-29 Aug. 2014
Firstpage :
1
Lastpage :
7
Abstract :
In this paper we study the use of a portfolio of policies for adversarial problems. We use two different portfolios of policies and apply them to the game of Go. The first portfolio is composed of different versions of the GnuGo agent. The second portfolio is composed of fixed random seeds. First, we demonstrate that learning an offline combination of these policies using the notion of Nash equilibrium generates a stronger opponent. Second, we show that such distributions can be learned online through a bandit approach. The advantages of our approach are (i) diversity (the Nash-Portfolio is more variable than its components), (ii) adaptivity (the Bandit-Portfolio adapts to the opponent), (iii) simplicity (no computational overhead), and (iv) increased performance. Due to the importance of games on mobile devices, designing artificial intelligences for small computational power is crucial; our approach is particularly suited to mobile devices since it creates a stronger opponent simply by biasing the distribution over the policies, and moreover it generalizes quite well.
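Illustrative sketch (not the authors' code): the abstract describes an offline Nash mixture over a portfolio of policies and an online bandit that adapts the mixture to the opponent. Below is a minimal, assumed Python example of both ideas, using fictitious play on a hypothetical win-rate matrix M for the Nash-Portfolio and a standard EXP3 update for the Bandit-Portfolio; all names and parameters are illustrative assumptions.

# Minimal sketch, assuming a win-rate matrix M[i, j] = P(row policy i beats column policy j).
import numpy as np

def nash_portfolio(M, iters=10000):
    """Fictitious play on the zero-sum matrix game defined by M.
    Returns an approximate Nash mixture over the row policies."""
    n, m = M.shape
    row_counts = np.zeros(n)
    col_counts = np.zeros(m)
    for _ in range(iters):
        # Each side best-responds to the opponent's empirical mixture.
        row_counts[np.argmax(M @ (col_counts + 1))] += 1
        col_counts[np.argmin((row_counts + 1) @ M)] += 1
    return row_counts / row_counts.sum()

def exp3_portfolio(n_policies, play_game, horizon=1000, gamma=0.1):
    """EXP3 bandit over the portfolio: adapts a distribution online.
    play_game(i) -> reward in [0, 1], e.g. 1 if policy i wins the game."""
    weights = np.ones(n_policies)
    for _ in range(horizon):
        probs = (1 - gamma) * weights / weights.sum() + gamma / n_policies
        i = np.random.choice(n_policies, p=probs)
        reward = play_game(i)
        # Importance-weighted exponential update for the played arm.
        weights[i] *= np.exp(gamma * reward / (probs[i] * n_policies))
    return weights / weights.sum()

if __name__ == "__main__":
    # Toy 3-policy win-rate matrix (purely illustrative).
    M = np.array([[0.5, 0.7, 0.2],
                  [0.3, 0.5, 0.8],
                  [0.8, 0.2, 0.5]])
    print("Nash-Portfolio:", nash_portfolio(M))
    # Bandit-Portfolio adapting against an opponent fixed on column policy 1.
    print("Bandit-Portfolio:",
          exp3_portfolio(3, lambda i: float(np.random.rand() < M[i, 1])))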
Keywords :
artificial intelligence; computer games; game theory; mobile computing; GnuGo agent; Go game; Nash equilibrium; Nash-portfolio; adaptivity; adversarial problems; artificial intelligence; bandit approach; bandit-portfolio; computational power; diversity; fixed random seeds; mobile devices; policies portfolio; simplicity; Portfolios
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Computational Intelligence and Games (CIG), 2014 IEEE Conference on
Conference_Location :
Dortmund
Type :
conf
DOI :
10.1109/CIG.2014.6932897
Filename :
6932897