Nov 17, 2021
Model-based meta-reinforcement learning (RL) methods have recently been shown to be a promising approach to improving the sample efficiency of RL in multi-task settings. However, the theoretical understanding of these methods has yet to be established, and there is currently no theoretical guarantee of their performance in a real-world environment. In this paper, we analyze the performance guarantee of model-based meta-RL methods by extending the theorems proposed by Janner et al. (2019). On the basis of our theoretical results, we propose Meta-Model-Based Meta-Policy Optimization (M3PO), a model-based meta-RL method with a performance guarantee. We demonstrate that M3PO outperforms existing meta-RL methods in continuous-control benchmarks.
The 13th Asian Conference on Machine Learning (ACML 2021) aims to provide a leading international forum for researchers in machine learning and related fields to share their new ideas, progress, and achievements.