            On the Estimation Bias in Double Q-Learning

            Dec 6, 2021

            Speakers

            Zhizhou Ren

            Guangxiang Zhu

            Hao Hu

            About

            Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation. Its variants in the deep Q-learning paradigm have shown great promise in producing reliable value prediction and improving learning performance. However, as shown by prior work, double Q-learning is not fully unbiased and suffers from underestimation bias. In this paper, we show that such underestimation bias may lead to multiple non-optimal fixed…
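The two biases named in the abstract can be illustrated with a toy Monte Carlo experiment. The sketch below is not the paper's method or experimental setup: the function name `estimator_biases`, the Gaussian noise model, and the example action values are all assumptions chosen for illustration. It compares the single estimator (max over one set of noisy value estimates) with the double estimator (pick the greedy action from one set, evaluate it with an independent second set).

```python
import numpy as np

def estimator_biases(true_q, noise_std=1.0, n_trials=100_000, seed=0):
    """Return (single_bias, double_bias) relative to max(true_q).

    Hypothetical helper: assumes independent zero-mean Gaussian noise
    on every action-value estimate.
    """
    rng = np.random.default_rng(seed)
    true_q = np.asarray(true_q, dtype=float)
    shape = (n_trials, true_q.size)

    # Two independent noisy estimates of the same true action values.
    q_a = true_q + rng.normal(0.0, noise_std, size=shape)
    q_b = true_q + rng.normal(0.0, noise_std, size=shape)

    # Single estimator: take the max of one noisy estimate.
    single = q_a.max(axis=1)

    # Double estimator: select the action with q_a, evaluate it with q_b.
    greedy = q_a.argmax(axis=1)
    double = q_b[np.arange(n_trials), greedy]

    target = true_q.max()
    return single.mean() - target, double.mean() - target

single_bias, double_bias = estimator_biases([0.0, 0.25, 0.5, 0.75, 1.0])
print(f"single estimator bias: {single_bias:+.3f}")
print(f"double estimator bias: {double_bias:+.3f}")
```

Under these assumptions the single estimator's bias comes out positive (overestimation) and the double estimator's bias comes out negative (underestimation), because the noisy selection step sometimes picks a suboptimal action whose independent evaluation is lower than the true maximum.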

            Organizer

            NeurIPS 2021

            About NeurIPS 2021

            Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia, and oral and poster presentations of refereed papers. The conference is followed by workshops, which provide a less formal setting.
