            Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations

            Nov 28, 2022

            Speakers

            Albert Wilcox

            Ashwin Balakrishna

            Jules Dedieu

            About

            Providing densely shaped reward functions for RL algorithms is often exceedingly challenging, motivating the development of RL algorithms that can learn from easier-to-specify sparse reward functions. This sparsity poses new exploration challenges; one common response is to use demonstrations to provide an initial signal about regions of the state space with high rewards. However, prior RL-from-demonstrations algorithms introduce significant complexity and many hyperparameters, making them hard to…

            Organizer

            NeurIPS 2022
