A Connection between One-Step RL and Critic Regularization in Reinforcement Learning

            Jul 24, 2023

            Speakers

            Benjamin Eysenbach

            Matthieu Geist

            Sergey Levine

            About

            As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting. One class of methods, known as one-step RL, performs just one step of policy improvement. These methods, which include advantage-weighted regression and conditional behavioral cloning, are thus simple and stable, but can have limited asymptotic performance. A second class of methods, known as critic regularization, performs many steps of policy improvement wi…
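
To make "one step of policy improvement" concrete, here is a minimal tabular sketch of the exponentiated-advantage reweighting commonly associated with advantage-weighted regression. It is an illustration under stated assumptions, not the implementation from the talk: the function name `one_step_awr_policy`, the `temperature` parameter, and the assumption that the behavior policy and its Q-values are already available as tables are all hypothetical choices made for this example.

```python
import math

def one_step_awr_policy(behavior_probs, q_values, temperature=1.0):
    """Return a policy proportional to beta(a|s) * exp(A(s, a) / temperature).

    behavior_probs[s][a]: behavior policy beta(a|s); each row is assumed
        to be a valid probability distribution.
    q_values[s][a]: Q-values of the behavior policy, assumed to be given.
    """
    improved = []
    for beta_s, q_s in zip(behavior_probs, q_values):
        # State value under the behavior policy: V(s) = sum_a beta(a|s) Q(s, a).
        value = sum(b * q for b, q in zip(beta_s, q_s))
        advantages = [q - value for q in q_s]
        # Subtract the maximum advantage before exponentiating for stability;
        # the constant cancels when the weights are normalized.
        max_adv = max(advantages)
        weights = [
            b * math.exp((adv - max_adv) / temperature)
            for b, adv in zip(beta_s, advantages)
        ]
        total = sum(weights)
        improved.append([w / total for w in weights])
    return improved

# A single state with two equally likely actions under the behavior policy.
policy = one_step_awr_policy([[0.5, 0.5]], [[0.0, 1.0]])
```

The reweighting is applied once, to Q-values of the behavior policy, and is not iterated. Actions the behavior policy never takes keep zero probability, which is one way to see why such methods stay close to the data.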

            Organizer

            ICML 2023
