            A Policy-Guided Imitation Approach for Offline Reinforcement Learning

            Nov 28, 2022

            Speakers

            Haoran Xu

            Li Jiang

            Jianxiong Li

            About

            Offline reinforcement learning (RL) methods can generally be categorized into two types: RL-based and imitation-based. RL-based methods can in principle generalize out of distribution but suffer from erroneous off-policy evaluation. Imitation-based methods avoid off-policy evaluation but are too conservative to surpass the dataset. In this study, we propose an alternative approach, inheriting the training stability of imitation-style methods while still allowing logical out-of-distri…

            Organizer

            NeurIPS 2022
