Dec 6, 2021
Pre-trained, low-level skills for reinforcement learning provide higher-level action spaces with the potential to facilitate exploration. The design of such skills presents an inductive bias and is therefore subject to a trade-off between faster learning and generality across environments. In prior work on continuous control, the sensitivity of methods to this inductive bias has not been addressed explicitly, as locomotion provides a suitable prior for navigation tasks, which have been of foremost interest. In this work, we introduce a benchmark suite of sparse-reward tasks for bipedal robots demanding a variety of motor abilities. This allows us to clearly expose the consequences of trading off exploration benefits and fine-grained control. We propose a novel hierarchical skill learning framework that offloads this trade-off to high-level policy training and produces skills that are useful across a wide range of environments. Finally, we present a three-layered hierarchical learning algorithm to perform this trade-off automatically, outperforming existing approaches to end-to-end hierarchical reinforcement learning and unsupervised skill discovery.
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia, and oral and poster presentations of refereed papers. The conference is followed by workshops, which provide a less formal setting.