Dec 6, 2021
Speaker · 0 followers
Most existing imitation learning approaches assume the demonstrations are drawn from experts who are optimal, but relaxing this assumption enables us to tackle a much wider range of data. Standard imitation learning fails when learning from demonstrations with varying optimality, and only learns suboptimal policies. Previous works use confidence scores or rankings to capture beneficial information from demonstrations with varying optimality, but they suffer from many limitations, e.g., manually annotated confidence scores or strong assumptions on the environments. In this paper, we propose a general framework for imitation learning from demonstrations with varying optimality that jointly learns the confidence score and a well-performing policy. Our approach, Confidence-Aware Imitation Learning (CAIL), learns a well-performing policy from confidence-reweighted demonstrations while using an outer loss to track the performance of the model and learn the confidence. We provide theoretical guarantees on the convergence of CAIL and evaluate its performance in several simulated environments as well as a real robot experiment. Our results demonstrate that, when learning from demonstrations with varying optimality, CAIL significantly outperforms other imitation learning methods. We also demonstrate that even without access to any optimal demonstrations, our algorithm can still learn a successful policy and outperforms prior work.
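To make the idea of learning from confidence-reweighted demonstrations concrete, here is a minimal sketch using a linear policy fitted by weighted least squares. It is not the CAIL algorithm: the confidence values are fixed by hand here, whereas the abstract describes learning them jointly with the policy through an outer loss, which this sketch does not implement. The function name `weighted_bc_fit`, the synthetic data, and the linear policy form are all assumptions made for illustration.

```python
import numpy as np


def weighted_bc_fit(states, actions, confidence):
    """Fit a linear policy a = s @ theta by confidence-weighted least squares.

    Each demonstration's squared error is scaled by its confidence, so
    low-confidence demonstrations contribute less to the fitted policy.
    """
    w = np.asarray(confidence, dtype=float)
    sqrt_w = np.sqrt(w)[:, None]
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary
    # least squares on the rescaled problem.
    theta, *_ = np.linalg.lstsq(states * sqrt_w, actions * sqrt_w, rcond=None)
    return theta


# Synthetic data (hypothetical): half the demonstrations follow a good
# policy, half follow a poor one.
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 2))
good_theta = np.array([[1.0], [-2.0]])
bad_theta = np.array([[-1.0], [0.5]])
actions = np.vstack([states[:100] @ good_theta, states[100:] @ bad_theta])

# Uniform confidence corresponds to standard imitation of all the data.
uniform = np.ones(200)
# Hand-assigned confidence (assumption): trust only the good demonstrations.
confident = np.concatenate([np.ones(100), np.zeros(100)])

theta_uniform = weighted_bc_fit(states, actions, uniform)
theta_weighted = weighted_bc_fit(states, actions, confident)

err_uniform = np.linalg.norm(theta_uniform - good_theta)
err_weighted = np.linalg.norm(theta_weighted - good_theta)
print("error with uniform weights:   ", err_uniform)
print("error with confidence weights:", err_weighted)
```

With uniform weights the fitted policy is pulled toward a blend of the good and poor demonstrators, while the confidence-weighted fit recovers the good policy's parameters. In practice the confidence would not be known in advance, which is the gap the abstract's joint learning of confidence and policy is meant to address.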
Account · 1.9k followers
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.