Dec 2, 2022
Speaker · 0 followers
Speaker · 0 followers
Speaker · 0 followers
Speaker · 0 followers
In Reinforcement Learning, Intrinsic Motivation motivates directed behaviors through a wide range of reward-generating methods. Depending on the task and environment, these rewards can be useful, might complement each other, but can also break down entirely, as seen with the noisy TV problem for curiosity. We therefore argue that scalability and robustness, among others, are key desirable properties of a method to incorporate intrinsic rewards, which a simple weighted sum of reward lacks. In a tabular setting, Explore Options let the agent call an intrinsically motivated policy in order to learn from its trajectories. We introduce Deep Explore Options, revising Explore Options within the Deep Reinforcement Learning paradigm to tackle complex visual problems. Deep Explore Options can naturally learn from several unrelated intrinsic rewards, ignore harmful intrinsic rewards, learn to balance exploration, but also isolate exploitative and exploratory behaviors for independent usage. We test Deep Explore Options on hard and easy exploration games of the Atari Suite, following a benchmarking study to ensure fairness. Our empirical results show that they achieve similar results than weighted sum baselines, while maintaining their key properties.In Reinforcement Learning, Intrinsic Motivation motivates directed behaviors through a wide range of reward-generating methods. Depending on the task and environment, these rewards can be useful, might complement each other, but can also break down entirely, as seen with the noisy TV problem for curiosity. We therefore argue that scalability and robustness, among others, are key desirable properties of a method to incorporate intrinsic rewards, which a simple weighted sum of reward lacks. In a t…
Account · 961 followers
Professional recording and live streaming, delivered globally.
Presentations on similar topic, category or speaker
Yichao Liang, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Paul Mangold, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Hanqing Wang, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Zhiwei Deng, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%