Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov · Anti-Exploration by Random Network Distillation · SlidesLive

Anti-Exploration by Random Network Distillation

Jul 24, 2023

Speakers

Alexander Nikulin

Vladislav Kurenkov

Denis Tarasov

About

Despite the success of Random Network Distillation (RND) in various domains, it was shown to be insufficiently discriminative to serve as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning. In this paper, we revisit these results and show that, with a naive choice of conditioning for the RND prior, it becomes infeasible for the actor to effectively minimize the anti-exploration bonus, and that discriminativity itself is not an issue. We show that this limitation…
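
As a rough illustration of the idea in the abstract, the sketch below computes an RND-style anti-exploration bonus: a fixed random prior network is distilled into a predictor on dataset state-action pairs, and the remaining prediction error is used as the penalty. This is not the authors' implementation. The linear networks, the concatenation of state and action as the conditioning, and every name here (`make_linear`, `rnd_bonus`, `train_predictor`, the toy dataset) are assumptions made only for illustration.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random linear map (hypothetical stand-in for a neural network)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Multiply the weight matrix by the input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def rnd_bonus(prior, predictor, state, action):
    """Squared error between predictor and frozen prior on (state, action)."""
    x = list(state) + list(action)  # assumption: plain concatenation
    target = apply_linear(prior, x)
    pred = apply_linear(predictor, x)
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def train_predictor(prior, predictor, dataset, lr=0.05, epochs=200):
    """Distill the frozen prior into the predictor with plain SGD."""
    for _ in range(epochs):
        for state, action in dataset:
            x = list(state) + list(action)
            target = apply_linear(prior, x)
            pred = apply_linear(predictor, x)
            for row, p, t in zip(predictor, pred, target):
                err = p - t
                for j, xj in enumerate(x):
                    row[j] -= lr * 2.0 * err * xj  # gradient of squared error
    return predictor


rng = random.Random(0)
prior = make_linear(2, 4, rng)               # frozen, never trained
predictor = [[0.0, 0.0] for _ in range(4)]   # trained on the dataset only

# Toy offline dataset: the behavior policy always chose action == state.
dataset = [((s,), (s,)) for s in (-1.0, -0.5, 0.5, 1.0)]
train_predictor(prior, predictor, dataset)

in_dist = rnd_bonus(prior, predictor, (0.5,), (0.5,))   # action seen in data
ood = rnd_bonus(prior, predictor, (0.5,), (-0.5,))      # action never seen
print(f"in-distribution bonus: {in_dist:.3e}, out-of-distribution bonus: {ood:.3e}")
```

In this toy setting the bonus is near zero for the action that appears in the dataset and clearly positive for the unseen one. An offline agent could then subtract a scaled bonus from its value estimate, where the scale would be a hypothetical hyperparameter not given in the abstract.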

Organizer

ICML 2023
