Dez 2, 2022
Sprecher:in · 0 Follower:innen
Sprecher:in · 0 Follower:innen
Sprecher:in · 1 Follower:in
Sprecher:in · 1 Follower:in
Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find that leveraging just a handful of demonstrations can dramatically improve the sample-efficiency of model-based RL. Simply appending demonstrations to the interaction dataset, however, does not suffice. We identify key ingredients for leveraging demonstrations in model learning – policy pretraining, targeted exploration, and oversampling of demonstration data – which forms the three phases of our model-based RL framework. We empirically study three complex visuo-motor control domains and find that our method is 160Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find t…
Konto · 961 Follower:innen
Professionelle Aufzeichnung und Livestreaming – weltweit.
Präsentationen, deren Thema, Kategorie oder Sprecher:in ähnlich sind
Xuran Pan, …
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Siqi Zhang, …
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Qinzi Zhang, …
Ewigspeicher-Fortschrittswert: 0 = 0.0%