Training graph neural networks with policy gradients to perform tree search

2. Prosinec 2022

Řečníci

O prezentaci

Monte Carlo Tree Search has shown to be a well-performing approach for decision problems such as board games and Atari games, but relies on heuristic design decisions that are non-adaptive and not necessarily optimal for all problems. Learned policies and value functions can augment MCTS by leveraging the state information at the nodes in the search tree. However, these learned functions do not take the search tree structure into account and can be sensitive to value estimation errors. In this paper, we propose a new method that, using Reinforcement Learning, learns how to expand the search tree and make decisions using Graph Neural Networks. This enables the policy to fully leverage the search tree and learn how to search based on the specific problem. Firstly, we show in an environment where state information is limited that the policy is able to leverage information from the search tree. Concluding, we find that the method outperforms popular baselines on two diverse and problems known to require planning: Sokoban and the Travelling salesman problem.

Organizátor

Uložení prezentace

Měla by být tato prezentace uložena po dobu 1000 let?

Jak ukládáme prezentace

Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %

Sdílení

Doporučená videa

Prezentace na podobné téma, kategorii nebo přednášejícího

Zajímají Vás podobná videa? Sledujte NeurIPS 2022