            Anti-Exploration by Random Network Distillation

            Jul 24, 2023

            Speakers

            Alexander Nikulin

            Vladislav Kurenkov

            Denis Tarasov

            About

            Despite the success of Random Network Distillation (RND) in various domains, it was shown to be insufficiently discriminative to serve as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning. In this paper, we revisit these results and show that, with a naive choice of conditioning for the RND prior, it becomes infeasible for the actor to effectively minimize the anti-exploration bonus, and that discriminativity is not an issue. We show that this limitation…

            Organizer

            ICML 2023
