            Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction

            Dec 2, 2022

            Speakers

            Jiachen Li

            Shuo Cheng

            Zhenyu Liao

            About

            Improving the sample efficiency of reinforcement learning algorithms requires effective exploration. Following the principle of optimism in the face of uncertainty (OFU), we train a separate exploration policy to maximize the approximate upper confidence bound of the critics in an off-policy actor-critic framework. However, this introduces extra differences between the replay buffer and the target policy regarding their stationary state-action distributions. To mitigate the off-policy-ness, we a…
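
            To make the optimism idea concrete, here is a minimal sketch of one common way to form an approximate upper confidence bound from several critics: the mean of their Q-estimates plus a multiple of their disagreement. This is an illustration only, not necessarily the construction used in the talk. The function name `approximate_ucb`, the coefficient `beta`, the discrete action set, and the toy numbers are all assumptions introduced for this example.

```python
import numpy as np


def approximate_ucb(q_values, beta=1.0):
    """Return mean + beta * std across critics (hypothetical helper).

    q_values: array-like of shape (n_critics, n_actions), one row of
    Q-estimates per critic for a single state.
    beta: assumed optimism coefficient; larger values favor actions
    on which the critics disagree more.
    """
    q = np.asarray(q_values, dtype=float)
    mean = q.mean(axis=0)  # average estimate per action
    std = q.std(axis=0)    # critic disagreement per action
    return mean + beta * std


# Toy example: two critics scoring three candidate actions.
q_estimates = np.array([
    [1.0, 2.0, 0.5],
    [1.0, 1.0, 2.5],
])

ucb = approximate_ucb(q_estimates, beta=1.0)

# A greedy policy follows the mean estimate; an optimistic exploration
# policy follows the upper bound instead.
greedy_action = int(np.argmax(q_estimates.mean(axis=0)))
optimistic_action = int(np.argmax(ucb))
```

            In this toy case the two rules pick different actions: the mean is tied between the second and third actions, while the bound favors the third, where the critics disagree most. Data collected by such an exploration policy differs from what the target policy would generate, which is the mismatch between the replay buffer and the target policy that the abstract describes.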

            Organizer

            NeurIPS 2022
