            A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning

            Nov 28, 2022

Speakers

Bo Liu
Speaker · 1 follower

Xidong Feng
Speaker · 0 followers

Jie Ren
Speaker · 0 followers

            About

            Gradient-based Meta-RL (GMRL) refers to methods that maintain two-level optimisation procedures wherein the outer-loop meta-learner guides the inner-loop gradient-based reinforcement learner to achieve fast adaptations. In this paper, we develop a unified framework that describes variations of GMRL algorithms and points out that existing stochastic meta-gradient estimators adopted by GMRL are actually biased. Such meta-gradient bias comes from two sources: 1) the compositional bias incurred by t…
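The compositional bias the abstract refers to can be illustrated with a toy one-dimensional sketch: a one-step inner-loop adaptation whose gradient is estimated from noisy samples and then plugged into a nonlinear outer gradient. All losses, step sizes, and noise levels below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

alpha, sigma = 0.5, 1.0      # inner-loop step size and gradient-noise std (assumed)
theta, a = 2.0, 0.0          # meta-parameter and inner-loss optimum (assumed)

# Inner loss L_in(th) = 0.5*(th - a)^2  -> exact inner gradient (th - a)
# Outer loss L_out(phi) = 0.25*phi^4    -> outer gradient phi^3 (nonlinear on purpose)

# Exact one-step adaptation and exact meta-gradient d L_out / d theta
phi = theta - alpha * (theta - a)
exact_meta_grad = (1 - alpha) * phi**3

# Stochastic estimator: the inner gradient is estimated from noisy samples and
# then composed with the nonlinear outer gradient -> compositional bias
n = 200_000
noisy_inner_grad = (theta - a) + sigma * rng.standard_normal(n)
phi_hat = theta - alpha * noisy_inner_grad
stoch_meta_grad = np.mean((1 - alpha) * phi_hat**3)

# For Gaussian noise, E[phi_hat^3] = phi^3 + 3*phi*alpha^2*sigma^2, so the
# expected bias of the stochastic estimator is:
predicted_bias = (1 - alpha) * 3 * phi * alpha**2 * sigma**2
print(exact_meta_grad, stoch_meta_grad - exact_meta_grad, predicted_bias)
```

Even though the inner-gradient estimate itself is unbiased, pushing it through the nonlinear outer gradient yields a biased meta-gradient, which is the "compositional bias" source named in the abstract.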

Organizer

NeurIPS 2022
Account · 961 followers


            Recommended Videos

Presentations on a similar topic, category, or by the same speaker

Provable Benefits of Representational Transfer in RL · 08:31
Alekh Agarwal, …
NeurIPS 2022 · 2 years ago

Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards · 04:38
Rati Devidze, …
NeurIPS 2022 · 2 years ago

Multi-dataset Training of Transformers for Robust Action Recognition · 04:55
Junwei Liang, …
NeurIPS 2022 · 2 years ago

You Never Stop Dancing: Non-freezing Dance Generation via Bank-constrained Manifold Projection · 01:04
Jiangxin Sun, …
NeurIPS 2022 · 2 years ago

Sparse Probabilistic Circuits via Pruning and Growing · 05:04
Meihua Dang, …
NeurIPS 2022 · 2 years ago

Conformal Prediction with Temporal Quantile Adjustments · 04:48
Zhen Lin, …
NeurIPS 2022 · 2 years ago
            Interested in talks like this? Follow NeurIPS 2022