Takuya Hiraoka, Takahisa Imagawa, Voot Tangkaratt, Takayuki Osa, Takashi Onishi, Yoshimasa Tsuruoka · Meta-Model-Based Meta-Policy Optimization · SlidesLive

Meta-Model-Based Meta-Policy Optimization

Nov 17, 2021

About

Model-based meta-reinforcement learning (RL) methods have recently been shown to be a promising approach to improving the sample efficiency of RL in multi-task settings. However, a theoretical understanding of these methods has yet to be established, and there is currently no theoretical guarantee of their performance in a real-world environment. In this paper, we analyze the performance guarantee of model-based meta-RL methods by extending the theorems proposed by Janner et al. (2019). On the basis of our theoretical results, we propose Meta-Model-Based Meta-Policy Optimization (M3PO), a model-based meta-RL method with a performance guarantee. We demonstrate that M3PO outperforms existing meta-RL methods in continuous-control benchmarks.

Organizer

About ACML 2021

The 13th Asian Conference on Machine Learning (ACML 2021) aims to provide a leading international forum for researchers in machine learning and related fields to share their new ideas, progress, and achievements.
