Dec 6, 2021
Prior AI successes in complex games have largely focused on settings with at most hundreds of actions at each decision point. In contrast, Diplomacy is a game with more than 10^20 possible actions per turn. Previous attempts to address games with large branching factor, such as Diplomacy, StarCraft and Dota, used human data to bootstrap the policy or used handcrafted reward shaping. In this paper we describe an algorithm for action exploration and equilibrium approximation in games with combinatorial action spaces. This algorithm simultaneously performs value iteration RL while learning a policy proposal network. A double oracle step is used to explore additional actions to add to the policy proposals. At each state, the target state value and policy for RL training are computed via an equilibrium search procedure. Using this algorithm, we train an agent, DORA, completely from scratch for a popular two-player variant of Diplomacy and show that it achieves superhuman performance. Additionally, we extend our methods to full-scale no-press Diplomacy and for the first time train an agent from scratch with no human data. We show this agent differs radically from past agents that required human data.
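To make the double oracle idea in the abstract concrete, here is a minimal sketch on a toy two-player zero-sum matrix game. It is not the DORA implementation: there is no neural network, no value iteration, and no Diplomacy state. The function names (`solve_restricted`, `double_oracle`), the choice of regret matching as the equilibrium solver for the restricted game, and the iteration counts are all assumptions made for illustration.

```python
def _positive_normalize(regrets):
    # Regret matching: play actions in proportion to positive regret,
    # falling back to uniform when no action has positive regret.
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in pos]


def solve_restricted(payoff, rows, cols, iters=2000):
    """Approximate an equilibrium of the game restricted to `rows` x `cols`.

    `payoff[r][c]` is the row player's payoff; the column player gets the negative.
    Returns the average strategies over all iterations (hypothetical helper).
    """
    n, m = len(rows), len(cols)
    reg_r, reg_c = [0.0] * n, [0.0] * m
    avg_r, avg_c = [0.0] * n, [0.0] * m
    for _ in range(iters):
        pr, pc = _positive_normalize(reg_r), _positive_normalize(reg_c)
        # Expected payoff of each restricted action against the opponent's mix.
        u_r = [sum(pc[j] * payoff[rows[i]][cols[j]] for j in range(m)) for i in range(n)]
        u_c = [-sum(pr[i] * payoff[rows[i]][cols[j]] for i in range(n)) for j in range(m)]
        v_r = sum(pr[i] * u_r[i] for i in range(n))
        v_c = sum(pc[j] * u_c[j] for j in range(m))
        for i in range(n):
            reg_r[i] += u_r[i] - v_r
            avg_r[i] += pr[i] / iters
        for j in range(m):
            reg_c[j] += u_c[j] - v_c
            avg_c[j] += pc[j] / iters
    return avg_r, avg_c


def double_oracle(payoff, max_rounds=100):
    """Grow each player's action set with best responses until none is new."""
    rows, cols = [0], [0]  # start from one arbitrary action per player
    for _ in range(max_rounds):
        pr, pc = solve_restricted(payoff, rows, cols)
        # Best responses are searched over the FULL action sets.
        br_row = max(range(len(payoff)),
                     key=lambda a: sum(q * payoff[a][c] for q, c in zip(pc, cols)))
        br_col = min(range(len(payoff[0])),
                     key=lambda b: sum(p * payoff[r][b] for p, r in zip(pr, rows)))
        new_rows = [] if br_row in rows else [br_row]
        new_cols = [] if br_col in cols else [br_col]
        if not new_rows and not new_cols:
            return rows, cols, pr, pc
        rows += new_rows
        cols += new_cols
    pr, pc = solve_restricted(payoff, rows, cols)
    return rows, cols, pr, pc


# Rock-paper-scissors from the row player's point of view.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
# A game with a pure saddle point at (row 0, column 0).
SADDLE = [[2, 3], [1, 0], [0, 1]]
```

Calling `double_oracle(RPS)` ends with all three actions in both sets, because every action is needed at equilibrium, while `double_oracle(SADDLE)` stops with only the starting action for each player. The second case is the point of the technique: actions that are never a best response are never added, which is what makes the approach attractive when the full action space is too large to enumerate. In this toy version the best response is found by enumeration, which would not scale to a combinatorial action space.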
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.