            Why So Pessimistic? Estimating uncertainties for offline RL through Ensembles, and why their independence matters
            Nov 28, 2022

            Speakers

            Kamyar Ghasemipour

            Shixiang Gu

            Ofir Nachum

            About

            Motivated by the success of ensembles for uncertainty estimation in supervised learning, we take a renewed look at how ensembles of Q-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL). We begin by identifying a critical flaw in a popular algorithmic choice used by many ensemble-based RL algorithms, namely the use of shared pessimistic target values when computing each ensemble member’s Bellman error. Through theoretical analyses and constructio…
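
            The contrast the abstract draws can be illustrated with a small sketch. This is not the speakers' implementation: it assumes the shared pessimistic target is the minimum over the ensemble's next-state estimates (one common choice, not stated in the abstract), and the function names, array shapes, and numbers are hypothetical. Terminal-state handling is omitted for brevity.

```python
import numpy as np


def shared_pessimistic_targets(rewards, next_q, gamma=0.99):
    """Targets when all ensemble members share one pessimistic value.

    next_q has shape (N, B): N ensemble members, B transitions.
    Assumption: pessimism is the minimum over members.
    """
    pessimistic_next = next_q.min(axis=0)        # shape (B,)
    target = rewards + gamma * pessimistic_next  # shape (B,)
    # Every member regresses toward the same target row.
    return np.broadcast_to(target, next_q.shape).copy()


def independent_targets(rewards, next_q, gamma=0.99):
    """Targets when each member bootstraps only from its own estimate."""
    return rewards[None, :] + gamma * next_q     # shape (N, B)


# Hypothetical numbers: 2 ensemble members, 2 transitions.
rewards = np.array([1.0, 0.0])
next_q = np.array([[2.0, 5.0],
                   [4.0, 3.0]])

shared = shared_pessimistic_targets(rewards, next_q, gamma=0.5)
indep = independent_targets(rewards, next_q, gamma=0.5)

# With shared targets the members' Bellman errors are coupled:
# the target rows are identical across the ensemble.
print(shared)
print(indep)
```

            In the shared case every member receives the same row of targets, so the members are pulled toward each other; in the independent case the rows differ, and any spread in the ensemble comes from the members themselves.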

            Organizer

            NeurIPS 2022
